Nucleic acids encoding two-component sensing and regulatory proteins, antimicrobial proteins and uses therefor

ABSTRACT

Stress-related nucleic acid molecules and polypeptides and fragments and variants thereof are disclosed in the current invention. In addition, stress-related fusion proteins, antigenic peptides, and anti-stress-related antibodies are encompassed. The invention also provides recombinant expression vectors containing a nucleic acid molecule of the invention and cells into which the expression vectors have been introduced. Methods for producing the polypeptides and methods of use for the polypeptides of the invention are further disclosed.

CROSS REFERENCE TO RELATED APPLICATION

This application is a divisional of U.S. patent application Ser. No. 12/046,080, filed Mar. 11, 2008, which is a divisional of U.S. patent application Ser. No. 11/199,489, filed Aug. 8, 2005, and claims the benefit of U.S. Provisional Application Ser. No. 60/599,972, filed Aug. 9, 2004, the contents of which are herein incorporated by reference in their entirety.

REFERENCE TO SEQUENCE LISTING SUBMITTED ELECTRONICALLY

The official copy of the sequence listing is submitted electronically via EFS-Web as an ASCII formatted sequence listing with a file named 341683seqlist.txt, created on Apr. 30, 2010, and having a size of 563 KB and is filed concurrently with the specification. The sequence listing contained in this ASCII formatted document is part of the specification and is herein incorporated by reference in its entirety.

FIELD OF THE INVENTION

This invention relates to polynucleotides and polypeptides encoded by them, as well as methods for using the polypeptides and microorganisms producing them.

BACKGROUND OF THE INVENTION

Lactobacillus acidophilus is a Gram-positive, rod-shaped, non-spore forming, homofermentative bacterium that is a normal inhabitant of the gastrointestinal and genitourinary tracts. Since its original isolation by Moro (1900) from infant feces, the “acid loving” organism has been found in the intestinal tract of humans, breast-fed infants, and persons consuming high milk, lactose, or dextrin diets. Historically, L. acidophilus is the Lactobacillus species most often implicated as an intestinal probiotic capable of eliciting beneficial effects on the microflora of the gastrointestinal tract (Klaenhammer and Russell (2000) “Species of the Lactobacillus acidophilus complex” Encyclopedia of Food Microbiology, Volume 2, pp. 1151-1157. Robinson et al., eds. (Academic Press, San Diego, Calif.)). L. acidophilus can ferment hexoses, including lactose and more complex oligosaccharides, to produce lactic acid and lower the pH of the environment where the organism is cultured. Acidified environments (e.g., food, vagina, and regions within the gastrointestinal tract) can interfere with the growth of undesirable bacteria, pathogens, and yeasts. The organism is well known for its acid tolerance, survival in cultured dairy products, and viability during passage through the stomach and gastrointestinal tract. Lactobacilli and other commensal bacteria, some of which are considered as probiotic bacteria that “favor life,” are generally recognized for their role in flavor and aroma development and in spoilage retardation in fermented food products, and have been studied extensively for their effects on human health, particularly in the prevention or treatment of enteric infections, diarrheal disease, prevention of cancer, and stimulation of the immune system.

During fermentation, lactic acid bacteria are exposed to toxic byproducts of their growth, such as lactic acid and hydrogen peroxide, antimicrobial agents produced by neighboring microorganisms, and the harsh environmental conditions that are encountered during proper fermentation of a raw food item. They must also adapt to the extreme conditions found in the stomach during ingestion, and severe temperatures associated with storage or production conditions, as well as compete with other microorganisms for resources. These bacteria have evolved sensory and regulatory mechanisms, which enable them to monitor external conditions and respond accordingly. One such mechanism is referred to as the “two-component” system, and is structured around two proteins: a histidine protein kinase and a response regulator protein. Furthermore, one of the major responses controlled by these sensory and regulatory systems of these bacteria is the production of their own antimicrobial agents, of which bacteriocins are an example. Two-component regulatory systems have been shown to control many diverse processes in bacteria, such as sporulation, chemotaxis, nitrogen assimilation, outer membrane protein expression, response to osmolarity, regulation of competence and virulence, as well as the production of antimicrobials.

Microorganisms that can respond to changes in the environment, such as those present during commercial fermentation and storage, as well as those microorganisms that can compete more effectively with other microorganisms are advantageous. Therefore, isolated nucleic acid sequences encoding these proteins are desirable for use in engineering microorganisms, including Lactobacillus acidophilus, to have an increased ability to tolerate changes in growth environment and an improved ability to inhibit food-borne pathogens.

BRIEF SUMMARY OF THE INVENTION

Compositions and methods for modifying Lactobacillus organisms are provided. Compositions of the invention include isolated nucleic acid molecules encoding proteins involved in and those produced under the control of two-component sensing and regulatory systems.

Compositions comprise isolated nucleic acid molecules comprising a) a nucleic acid molecule comprising any one of odd numbered SEQ ID NOS:1-164; b) a nucleic acid molecule comprising a nucleotide sequence having at least 80% sequence identity to any one of odd numbered SEQ ID NOS:1-164; c) a nucleic acid molecule that encodes a polypeptide comprising the amino acid sequence as set forth in any one of even numbered SEQ ID NOS:1-164; d) a nucleic acid molecule comprising a nucleotide sequence encoding a polypeptide having at least 80% amino acid sequence identity to the amino acid sequence as set forth in any one of even numbered SEQ ID NOS:1-164; and e) a complement of any of a)-d).

Additional compositions include a polypeptide selected from the group consisting of a) a polypeptide comprising the amino acid sequence as set forth in any one of even numbered SEQ ID NOS:1-164; b) a polypeptide comprising an amino acid sequence having at least 80% sequence identity to the amino acid sequence as set forth in any one of even numbered SEQ ID NOS:1-164, wherein said polypeptide retains activity; c) a polypeptide encoded by the nucleotide sequence as set forth in any one of odd numbered SEQ ID NOS:1-164; and d) a polypeptide that is encoded by a nucleic acid molecule comprising a nucleotide sequence having at least 80% sequence identity to the nucleotide sequence as set forth in any one of odd numbered SEQ ID NOS:1-164.
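The 80% sequence identity threshold recited above can be illustrated with a minimal sketch. This section does not state how identity is to be calculated, so the function below makes two assumptions: the sequences have already been aligned to equal length with "-" marking gaps, and identity is the number of matching columns divided by the number of aligned columns. The function names and example strings are hypothetical, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned positions to compare")
    # A column matches only when both residues are identical and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True when the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned 10-residue strings that differ at two positions give 80% identity under this convention and so just meet the threshold.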

Variant nucleic acid molecules, peptides and polypeptides sufficiently identical to and/or functionally equivalent to the nucleotide and amino acid sequences set forth in the attached Sequence Listing are encompassed by the present invention. Additionally, fragments and sufficiently identical fragments of the nucleotide and amino acid sequences are encompassed. Nucleotide sequences that are complementary to a nucleotide sequence of the invention, or that hybridize to a sequence of the invention, are also encompassed.

Compositions of this invention further include vectors and cells comprising the nucleic acid molecules described herein, as well as cells and transgenic microbial populations comprising the vectors. Also included in the invention are methods for the recombinant production of the peptides and polypeptides of the invention, and methods for their use. Further included are methods and kits for detecting the presence of a nucleic acid and/or peptide and/or polypeptide sequence of the invention in a sample. Additionally provided are antibodies that bind to a peptide and/or polypeptide of the invention, methods of making the antibodies of this invention and methods for using the antibodies of this invention to detect a peptide and/or polypeptide of this invention.

Compositions also provided herein include a polypeptide of the invention further comprising one or more heterologous amino acid sequences, and antibodies that selectively bind to a polypeptide of the invention.

The two-component sensing and regulatory response molecules and molecules under the control of two-component sensing and regulatory response molecules of the present invention are useful for the selection and production of recombinant bacteria, particularly the production of bacteria with improved ability to survive under stressful conditions.

Additionally provided herein are methods for producing a polypeptide, comprising culturing a cell of the invention under conditions in which a nucleic acid molecule encoding the polypeptide is expressed, said polypeptide being selected from the group consisting of: a) a polypeptide comprising the amino acid sequence as set forth below; b) a polypeptide encoded by the nucleic acid sequence as set forth below; c) a polypeptide comprising an amino acid sequence having at least 80% sequence identity to the amino acid sequence as set forth below, wherein said polypeptide retains activity; and d) a polypeptide encoded by a nucleotide sequence having at least 80% sequence identity to the nucleotide sequence as set forth below, wherein said polypeptide retains activity.

Additionally provided are methods for detecting the presence of a polypeptide of the invention in a sample comprising contacting the sample with a compound that selectively binds to a polypeptide and determining whether the compound binds to the polypeptide in the sample.

Further provided are methods for detecting the presence of a polypeptide in a sample wherein the compound that binds to the polypeptide is an antibody, as well as kits comprising a compound for use in methods of the invention for detecting the presence of a polypeptide in a sample and instructions for use.

The present invention also provides methods for detecting the presence of a nucleic acid molecule and/or fragments thereof, of this invention in a sample, comprising: a) contacting the sample with a nucleic acid probe or primer that selectively hybridizes to the nucleic acid molecule; and b) detecting hybridization of the nucleic acid probe or primer with the nucleic acid molecule.
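The two detection steps above (contacting a sample with a probe, then detecting hybridization) can be pictured with a deliberately simplified in silico sketch. It assumes DNA sequences written 5′ to 3′ and treats "hybridization" as perfect Watson-Crick complementarity only; real hybridization tolerates mismatches and depends on stringency conditions, none of which is modeled here. The function names and example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def probe_sites(target: str, probe: str) -> list:
    """0-based start positions on `target` where `probe` would pair perfectly.

    A probe pairs with the stretch of the target that equals the probe's
    reverse complement, so we search the target for that stretch.
    """
    site = reverse_complement(probe).upper()
    target = target.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)  # allow overlapping sites
    return positions
```

An empty result corresponds to "no hybridization detected" in this toy model; a non-empty list gives the positions at which the probe would bind.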

Also provided are methods for detecting the presence of a nucleic acid molecule and/or fragment of the invention in a sample wherein the sample comprises mRNA molecules and is contacted with a nucleic acid probe. Additionally provided herein is a kit comprising a compound that selectively hybridizes to a nucleic acid of the invention, and instructions for use.

Further provided herein are methods for increasing the ability of a microorganism to survive stressful conditions, comprising introducing into said microorganism a nucleic acid molecule of the invention and expressing the nucleic acid molecule. In specific embodiments, the nucleotide sequence encodes a protein of a two-component regulatory system, a histidine protein kinase and/or a response regulator of a two-component regulatory system, a protein under the control of a two-component regulatory system, or a bacteriocin. In further aspects of the invention, the stressful conditions comprise osmotic stress, oxidative stress and/or starvation conditions.

Methods are also provided herein for enhancing the ability of a microorganism to survive passage through the gastrointestinal tract, comprising introducing into the microorganism a nucleic acid molecule comprising at least one nucleotide sequence selected from the group consisting of: a) the nucleotide sequence as set forth in any one of odd numbered SEQ ID NOS:1-164; b) a nucleotide sequence encoding a polypeptide comprising the amino acid sequence as set forth in any one of even numbered SEQ ID NOS:1-164; c) a nucleotide sequence that is at least 80% identical to the sequence as set forth in any one of odd numbered SEQ ID NOS:1-164, wherein said nucleotide sequence encodes a polypeptide that retains activity; and d) a nucleotide sequence encoding a polypeptide comprising an amino acid sequence having at least 80% sequence identity to the amino acid sequence as set forth in any one of even numbered SEQ ID NOS:1-164, wherein said polypeptide retains activity.

Methods are also provided herein for enhancing the ability of a microorganism to survive passage through the gastrointestinal tract, comprising introducing into the microorganism at least one nucleic acid molecule of the invention. In specific embodiments, the nucleotide sequence encodes a protein of a two-component regulatory system, a histidine protein kinase of a two-component regulatory system, a response regulator of a two-component regulatory system, a bacteriocin, and/or encodes a protein under the control of a two-component regulatory system.

Additional aspects of the invention comprise methods for increasing the ability of a microorganism to survive in the presence of an antimicrobial, comprising introducing into said microorganism a nucleic acid molecule comprising at least one nucleotide sequence of the invention. In specific embodiments, the nucleotide sequence encodes a protein of a two-component regulatory system, the nucleotide sequence encodes a histidine protein kinase of a two-component regulatory system and/or a response regulator of a two-component regulatory system, and/or the nucleotide sequence encodes a protein or proteins that are under the control of a two-component regulatory system.

Also provided are methods for enabling an organism to respond to environmental stimuli, comprising introducing into the organism a vector comprising at least one nucleotide sequence of the invention. In specific embodiments, the nucleotide sequence encodes a protein of a two-component regulatory system, a histidine protein kinase of a two-component regulatory system, a response regulator of a two-component regulatory system, a bacteriocin, and/or encodes a protein under the control of a two-component regulatory system. The environmental stimuli can be selected from the group consisting of turgor pressure, a chemical stimulus, heavy-metal cations, oxygen, iron, an antimicrobial compound, and various carbohydrates, including glucose.

Yet another embodiment of the invention comprises a Lactobacillus acidophilus cell with an increased ability to survive stressful conditions compared to a wild-type Lactobacillus acidophilus cell, wherein said increased ability to survive stressful conditions is the result of overexpression of a nucleic acid molecule encoding an amino acid sequence as set forth herein. In specific embodiments, the stressful conditions comprise osmotic stress, oxidative stress, starvation, or the presence of antimicrobials.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows the organization of the 1524HPK-1525RR two-component regulatory system in Lactobacillus acidophilus NCFM. The disrupted HPK gene is represented by a grey arrow. Putative terminator regions and their calculated free energy are indicated by hairpin structures. The start, putative ribosome binding site, potential promoter and transcription start are indicated. The sequence is set forth in SEQ ID NO:166.

FIG. 2A shows the survival of Lactobacillus acidophilus NCK1398 (NCFMΔlacL, squares) and the HPK mutant NCK1686 (circles) in MRS adjusted to pH 3.5 with lactic acid.

FIG. 2B shows the survival of Lactobacillus acidophilus NCK1398 (NCFMΔlacL, squares) and the HPK mutant NCK1686 (circles) following exposure to pH 5.5 (open symbols) or pH 6.8 (filled symbols) for 1 h prior to challenge at pH 3.5 (adjusted with lactate).

FIG. 3 shows the organization of the oligopeptide transport (opp) operons in Lactobacillus acidophilus NCFM. Predicted rho-independent terminators with a free energy over −10 kcal/mol (continuous line) and under −10 kcal/mol (dotted line) are indicated.

FIG. 4A shows the growth of Lactobacillus acidophilus NCFM (▪) and NCK1686 (NCFMΔ1524HPK, ) in milk (filled symbols) and milk supplemented with yeast extract (open symbols).

FIG. 4B shows the growth of Lactobacillus acidophilus NCFM (▪) and NCK1686 (NCFMΔ1524HPK, ) in milk (filled symbols) and milk supplemented with 0.25% casamino acids. r=0.99

FIG. 5A shows a Northern blot analysis of seven genes, performed using RNA isolated in three independent experiments from Lactobacillus acidophilus NCK1398 (NCFMΔlacL) and NCK1686 (NCFMΔ1524HPK) exposed to pH 6.8, 5.5 and 4.5 for 30 minutes. RNA ratios were calculated from the Northern blot data by densitometry analysis.

FIG. 5B shows a comparison of expression measurements by microarray and Northern blot analysis. The correlation coefficient for each condition is given in the figure.

FIGS. 6A and 6B show a bacteriocin assay that compares the wild-type NCFM (A) with the NCFM integrant (B).

FIG. 7 shows the ratio of maximum growth rates in MRS compared to MRS+Oxgall. Bars with * represent significantly different means within their group. Error bars represent the standard error of the mean.

FIG. 8 provides a series of growth curves using 0.3% of individual bile salts: no salt (A), sodium taurocholate (B), sodium taurodeoxycholate (C) and sodium taurochenodeoxycholate (D).

DETAILED DESCRIPTION OF THE INVENTION

The present inventions now will be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all embodiments of the inventions are shown. Indeed, these inventions may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Like numbers refer to like elements throughout.

Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation. The present invention relates to two-component sensing and regulatory system proteins and proteins under the control of the two-component regulatory system. These proteins include, but are not limited to, histidine protein kinases, response regulators and bacteriocins. Examples of nucleic acid sequences encoding two-component sensing and regulatory system proteins, related antimicrobial proteins and proteins under the control of two-component sensing and regulatory molecules are provided in Table 1.

Two-component regulatory system molecules and molecules expressed under the control of two-component regulatory system molecules are provided. The full-length gene sequences, referred to as “two-component regulatory system sequences,” have similarity to two-component regulatory system genes. The invention further provides fragments and variants of these two-component regulatory system sequences, which can also be used to practice the methods of the present invention. As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules comprising an open reading frame, particularly those encoding a two-component regulatory system protein. Isolated nucleic acid molecules of the present invention comprise nucleic acid sequences encoding two-component regulatory system proteins and proteins under the control of two-component regulatory system proteins, nucleic acid sequences encoding the amino acid sequences set forth in SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, the nucleic acid sequences set forth in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 164, and variants and fragments thereof. The present invention also encompasses antisense nucleic acid molecules, as described herein.

In addition, isolated peptides, polypeptides and proteins of a two-component regulatory system or that are produced under the control of a two-component regulatory system, and variants and fragments thereof are encompassed, as well as methods for producing all of these. For purposes of the present invention, the terms “protein” and “polypeptide” are used interchangeably. A representative amino acid sequence of the present invention is set forth in SEQ ID NO:2. In some embodiments, peptides and/or polypeptides of the present invention affect a stress-related protective activity. Stress-related protective activity refers to a biological or functional activity as determined in vivo or in vitro according to standard assay techniques. These techniques could involve, for example, measuring bacterial survival or growth under adverse environmental conditions. See, for example, Varcamonti et al. (2003) Appl. Environ. Microbiol. 69:1287-1289, herein incorporated by reference. By “adverse environmental conditions” or “stressful environmental conditions” is meant an environmental condition or state that is not conducive for growth of the microorganism, and includes, but is not limited to, acidic conditions, alkaline conditions, non-optimal osmotic stress conditions, non-optimal oxidative stress conditions, starvation conditions, and the presence of antimicrobials.

As used herein, the terms peptide and polypeptide are used to describe a chain of amino acids, which correspond to those encoded by a nucleic acid. A peptide usually describes a chain of amino acids of from two to about 30 amino acids and polypeptide usually describes a chain of amino acids having more than about 30 amino acids. The term polypeptide can refer to a linear chain of amino acids or it can refer to a chain of amino acids, which have been processed and folded into a functional protein. It is understood, however, that 30 is an arbitrary number with regard to distinguishing peptides and polypeptides and the terms may be used interchangeably for a chain of amino acids around 30. The peptides and polypeptides of the present invention are obtained by isolation and purification of the peptides and polypeptides from cells where they are produced naturally or by expression of a recombinant and/or synthetic nucleic acid encoding the peptide or polypeptide. The peptides and polypeptides of this invention can be obtained by chemical synthesis, by proteolytic cleavage of a polypeptide and/or by synthesis from nucleic acid encoding the peptide or polypeptide.

It is also understood that the peptides and polypeptides of this invention may also contain conservative substitutions where a naturally occurring amino acid is replaced by one having similar properties and which does not alter the function of the polypeptide. Such conservative substitutions are well known in the art. Thus, it is understood that, where desired, modifications and changes, which are distinct from the substitutions which enhance immunogenicity, may be made in the nucleic acid and/or amino acid sequence of the peptides and polypeptides of the present invention and still obtain a peptide or polypeptide having like or otherwise desirable characteristics. Such changes may occur in natural isolates or may be synthetically introduced using site-specific mutagenesis, the procedures for which, such as mis-match polymerase chain reaction (PCR), are well known in the art. One of skill in the art will also understand that polypeptides and nucleic acids that contain modified amino acids and nucleotides, respectively (e.g., to increase the half-life and/or the therapeutic efficacy of the molecule), can be used in the methods of the invention.

The nucleic acid and protein compositions encompassed by the present invention are isolated or substantially purified. By “isolated” or “substantially purified” is intended that the nucleic acid or protein molecules, or biologically active fragments or variants, are substantially or essentially free from components normally found in association with the nucleic acid or protein in its natural state. Such components include other cellular material, culture media from recombinant production, and various chemicals used in chemically synthesizing the proteins or nucleic acids. Preferably, an “isolated” nucleic acid of the present invention is free of nucleic acid sequences that flank the nucleic acid of interest in the genomic DNA of the organism from which the nucleic acid was derived (such as coding sequences present at the 5′ or 3′ ends). However, the molecule can include some additional bases or moieties, which do not deleteriously affect the basic characteristics of the composition. For example, in various embodiments, the isolated nucleic acid contains less than 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of nucleic acid sequence normally associated with the genomic DNA in the cells from which it was derived. Similarly, an isolated or substantially purified protein has less than about 30%, 20%, 10%, 5%, or 1% (by dry weight) of contaminating protein, or non-two-component regulatory protein. When the protein is recombinantly produced, preferably culture medium represents less than 30%, 20%, 10%, or 5% of the volume of the protein preparation, and when the protein is produced chemically, preferably the preparations have less than about 30%, 20%, 10%, or 5% (by dry weight) of chemical precursors, or non-two-component regulatory chemicals.

The compositions and methods of the present invention can be used to modulate the function of the two-component regulatory molecules of the invention or the sequences under the control of the two-component sensing or regulatory molecules. By “modulate,” “alter,” or “modify” is intended the up- or down-regulation of a target biological activity. In accordance with the present invention, the level or activity of a sequence of the invention is modulated (i.e., overexpressed or underexpressed) if the level and/or activity of the sequence is statistically lower or higher than the level and/or activity of the same sequence in an appropriate control. Concentration and/or activity can be increased or decreased by at least 0.5%, 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% relative to an appropriate control. Proteins of the invention are useful in modifying the biological activities of lactic acid bacteria, especially lactic acid bacteria that are used to ferment foods with nutritional or health-promoting characteristics. Nucleic acid molecules of the invention are useful in modulating production of the sequences of the invention by lactic acid bacteria. Up- or down-regulation of expression of a nucleic acid of the present invention is encompassed. Up-regulation can be accomplished by providing multiple nucleic acid copies, modulating expression by modifying regulatory elements, promoting transcriptional or translational mechanisms, or other means. Down-regulation can be accomplished by using known antisense and gene silencing techniques. By “lactic acid bacteria” is intended bacteria from a genus selected from the following: Aerococcus, Carnobacterium, Enterococcus, Lactococcus, Lactobacillus, Leuconostoc, Oenococcus, Pediococcus, Streptococcus, Melissococcus, Alloiococcus, Dolosigranulum, Lactosphaera, Tetragenococcus, Vagococcus, and Weissella (Holzapfel et al. (2001) Am. J. Clin. Nutr. 73:365S-373S; Bergey's Manual of Systematic Bacteriology, Vol. 2 (Williams and Wilkins, Baltimore (1986)) pp. 1075-1079).
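The percentage thresholds for modulation relative to a control can be made concrete with a short sketch. It computes the percent change of a measured level against a control level and labels the result using a chosen cutoff. The function names, the default 10% cutoff, and the example values are hypothetical, and the sketch deliberately omits the statistical test that the text requires ("statistically lower or higher"), which would need replicate measurements.

```python
def percent_change(sample_level: float, control_level: float) -> float:
    """Percent change of a measured level relative to a control level."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return 100.0 * (sample_level - control_level) / control_level


def classify_modulation(sample_level: float, control_level: float,
                        min_percent: float = 10.0) -> str:
    """Label a measurement as up-regulated, down-regulated, or unchanged.

    `min_percent` is an illustrative cutoff; the text lists several
    alternative thresholds (0.5%, 1%, 5%, 10%, ... 90%).
    """
    change = percent_change(sample_level, control_level)
    if change >= min_percent:
        return "up-regulated"
    if change <= -min_percent:
        return "down-regulated"
    return "unchanged"
```

With a control level of 100 arbitrary units, a reading of 150 is a 50% increase and a reading of 80 is a 20% decrease, so both would count as modulated at the 10% cutoff.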

Microorganisms expressing the nucleic acid molecules to produce the polypeptides of the present invention are useful as additives in dairy and fermentation processing. The nucleic acid sequences, encoded polypeptides, and microorganisms expressing them are useful in the manufacture of milk-derived products, such as cheeses, yogurt, fermented milk products, sour milks, and buttermilk. Microorganisms that produce polypeptides of the invention may be probiotic organisms. By “probiotic” is intended a live microorganism that survives passage through the gastrointestinal tract and has a beneficial effect on the subject. By “subject” is intended an organism that comes into contact with a microorganism producing a protein of the present invention. Subject may refer to humans and other animals.

In addition to the sequences disclosed herein, and fragments and variants thereof, the isolated nucleic acid molecules of the current invention also encompass homologous nucleic acid sequences identified and isolated from other organisms or cells by hybridization with entire or partial sequences obtained from the two-component regulatory nucleotide sequences disclosed herein, or variants and fragments thereof.

In another embodiment of the invention, nucleotide sequences and fragments thereof that are expressed under the control of proteins and polypeptides of a two-component regulatory system and the proteins and polypeptides encoded by those nucleotide sequences are provided. In a preferred embodiment, the protein or polypeptide produced from a nucleotide sequence under control of a two-component regulatory system is a bacteriocin. By “bacteriocin” is intended a group of polypeptides produced by a bacterium as an antimicrobial substance. Included in this group are: Class I bacteriocins or lantibiotics which contain the unusual amino acids lanthionine, β-methyl-lanthionine and dehydrated residues dehydroalanine and dehydrobutyrine; Class II bacteriocins, i.e., small heat-stable, non-lanthionine containing, membrane-active peptides; and Class III bacteriocins, i.e., large, heat-labile proteins.

Fragments and Variants

The invention provides isolated nucleic acid molecules comprising nucleotide sequences encoding two-component regulatory proteins, as well as peptides and/or proteins encoded thereby. By “two-component regulatory protein” or “two-component sensing protein” is meant proteins comprising, consisting of and/or consisting essentially of the amino acid sequences set forth in even numbered SEQ ID NOS:1-38. By “proteins under the control of two-component sensing and regulatory molecules” is meant proteins having the amino acid sequences set forth in even numbered SEQ ID NOS:40-164. Fragments and variants of these nucleotide sequences and encoded proteins are also provided. By “fragment” of a nucleotide sequence or protein is intended a portion of the nucleotide or amino acid sequence.
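The numbering convention in the definitions above can be summarized in a small lookup sketch: even-numbered SEQ ID NOS are amino acid sequences, those up to 38 being two-component regulatory proteins and those from 40 to 164 being proteins under two-component control. The sketch additionally assumes that each odd-numbered nucleotide sequence N encodes the even-numbered protein N+1; that pairing is suggested by the specification but is treated here as an assumption. The function name is hypothetical.

```python
def describe_seq_id(seq_id: int):
    """Return (molecule type, functional group) for a SEQ ID NO in 1-164.

    Assumption: odd-numbered nucleotide sequence N encodes the
    even-numbered amino acid sequence N + 1.
    """
    if not 1 <= seq_id <= 164:
        raise ValueError("SEQ ID NO outside the 1-164 range covered here")
    is_protein = seq_id % 2 == 0
    molecule = "amino acid" if is_protein else "nucleotide"
    # Group membership is decided by the (assumed) paired protein number.
    protein_id = seq_id if is_protein else seq_id + 1
    if protein_id <= 38:
        group = "two-component regulatory protein"
    else:
        group = "protein under two-component control"
    return molecule, group
```

Under these assumptions, SEQ ID NO:37 and SEQ ID NO:38 fall in the two-component group, while SEQ ID NO:39 and SEQ ID NO:40 fall in the group of proteins under two-component control.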

Fragments and variants of the nucleic acid molecules disclosed herein can be used as hybridization probes to identify two-component regulatory protein-encoding nucleic acids and/or proteins under the control of two-component sensing and regulatory molecules, or they can be used as primers in amplification protocols (e.g., polymerase chain reaction) or mutation of two-component regulatory nucleic acid molecules, proteins under the control of two-component sensing and regulatory molecules and/or stress-related nucleic acid molecules. Such fragments or variants need not encode functional polypeptides. Fragments of nucleic acids can also be bound to a physical substrate to comprise a macro- or microarray (for example, U.S. Pat. No. 5,837,832; U.S. Pat. No. 5,861,242). Such arrays of nucleic acids can be used to study gene expression or to identify nucleic acid molecules with sufficient identity to the target sequences.

By “nucleic acid molecule” is meant DNA molecules (e.g., cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and analogs of the DNA or RNA generated using nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA. A fragment of a nucleic acid molecule encoding a protein of the invention may encode a protein fragment that is biologically active, or it may be used as a hybridization probe or PCR primer as described below. A biologically active fragment of a polypeptide disclosed herein can be prepared by isolating a portion of one of the nucleotide sequences of the invention, expressing the encoded portion of the protein (e.g., by recombinant expression in vitro), and assessing the activity of the encoded portion.

Fragments of nucleic acid molecules of the invention comprise at least about 15, 20, 50, 75, 100, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1100, 1200, 1400, 1600, 1800, 2000, 2200, or 2415 nucleotides, including any value between the numbers recited here (e.g., 36 nucleotides or 423 nucleotides), up to the total number of nucleotides present in a full-length nucleotide sequence as disclosed herein (for example, 714 for SEQ ID NO:1, 1854 for SEQ ID NO:3, etc.).

Fragments of amino acid sequences of this invention can include polypeptide fragments that function as immunogens, for example, for the production of antibodies to two-component regulatory system proteins or to proteins under the control of two-component sensing and regulatory molecules. Fragments of this invention include peptides comprising amino acid sequences sufficiently identical to and/or derived from the amino acid sequence of a protein of the invention, or partial-length protein of the invention, and exhibiting at least one activity of the protein, but which include fewer amino acids than the full-length proteins disclosed herein. Typically, biologically active fragments of this invention comprise a domain or motif with at least one activity of the protein. A biologically active portion or fragment of a two-component regulatory protein or a protein under the control of two-component sensing and regulatory molecules can be a peptide or polypeptide that is, for example, 10, 25, 50, 100, 150, 200, 250, 300, 400, 500, 600, 700, or 805 contiguous amino acids in length, including any value in between those explicitly listed herein (e.g., 17 amino acids or 106 amino acids), up to the total number of amino acids present in a full-length protein of the current invention (for example, 238 for SEQ ID NO:2, 618 for SEQ ID NO:4, etc.). Such biologically active fragments can be prepared by recombinant techniques and evaluated for one or more of the functional activities according to standard protocols. As used herein, a fragment can comprise at least 5 contiguous amino acids of even numbered SEQ ID NOS:1-164. The invention encompasses other fragments, however, such as any fragment of a protein of this invention comprising greater than 6, 7, 8, or 9 amino acids.

Variants of the nucleotide and amino acid sequences are encompassed in the present invention. By “variant” is meant a sufficiently identical sequence. Accordingly, the invention encompasses isolated nucleic acid molecules that are sufficiently identical to the nucleotide sequences of the invention set forth in the odd numbered SEQ ID NOS:1-164, or nucleic acid molecules that hybridize to a nucleic acid molecule of odd numbered SEQ ID NOS:1-164, or a complement thereof, under stringent conditions. Variants also include variant polypeptides encoded by the nucleotide sequences of the present invention. In addition, polypeptides of the current invention have an amino acid sequence that is sufficiently identical to an amino acid sequence set forth in even numbered SEQ ID NOS:1-164. By “sufficiently identical” is meant that one amino acid or nucleotide sequence contains or encodes a sufficient or minimal number of equivalent or identical amino acid residues or nucleotides as compared to a second amino acid or nucleotide sequence, thus providing a common structural domain and/or a common functional activity. Conservative variants include those sequences that differ due to the degeneracy of the genetic code.
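By way of non-limiting illustration, the following minimal Python sketch shows how two nucleotide sequences that differ only through the degeneracy of the genetic code can encode the same polypeptide while sharing low nucleotide identity. The two short coding sequences, the function names, and the use of the standard genetic code are assumptions made for this example and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code laid out in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def nucleotide_identity(a: str, b: str) -> float:
    """Percent of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Hypothetical coding sequences that use different synonymous codons.
seq_a = "ATGCTGAGCCGT"
seq_b = "ATGTTATCAAGA"

same_protein = translate(seq_a) == translate(seq_b)
identity = nucleotide_identity(seq_a, seq_b)
```

In this sketch both sequences translate to the same four-residue peptide even though fewer than half of their nucleotide positions match.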

In general, amino acid or nucleotide sequences that have at least about 45%, 55%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.5% sequence identity to any of the amino acid sequences of even numbered SEQ ID NOS:1-164 or any of the nucleotide sequences of odd numbered SEQ ID NOS:1-164, respectively, are defined herein as sufficiently identical. Variant proteins encompassed by the present invention are biologically active, that is, they retain a desired biological activity of the native protein. Such activities are discussed in detail elsewhere herein. By “two-component regulatory system activity” is intended the ability of an organism to respond to an environmental stimulus to enable the organism to better survive. This encompasses both stressful environmental conditions, as described above, and beneficial environmental conditions, wherein a molecule desired by the organism is present, such as glucose. Assays to measure the activity of two-component regulatory system proteins or the proteins under the control of the two-component sensing and regulatory molecules are well known in the art. See, for example, Lee et al. (2004) Infect. Immun. 72:3968-3973; Walker and Miller (2004) J. Bacteriol. 186:4056-4066; Saini et al. (2004) Microbiology 150:865-875; Abo-Amer et al. (2004) J. Bacteriol. 186:1879-1889. A biologically active variant of a protein of the invention can differ from that protein by as few as 1-15 amino acid residues, as few as 1-10, such as 6-10, as few as 5, as few as 4, 3, 2, or even 1 amino acid residue.

In one embodiment, the sequence according to the present invention or for use in the methods of the invention may be one or more of the nucleotide sequences set forth in SEQ ID NOS:3, 7, 13, 15, 19, 23, 29, 33, and 35, which can encode a histidine kinase. Variants of such nucleotide sequences are also included, including sequences that have at least about 45%, 55%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.5% sequence identity thereto.

In one embodiment, the sequence according to the present invention or for use in the methods of the invention may be one or more of the nucleotide sequences set forth in SEQ ID NOS:73, 75, 85, 89, 91, 95, and 113, which can encode a bacteriocin. Variants of such nucleotide sequences are also included, including sequences that have at least about 45%, 55%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.5% sequence identity thereto.

In one embodiment, the sequence according to the present invention or for use in the methods of the invention may be one or more of the nucleotide sequences set forth in SEQ ID NOS:1, 9, 11, 17, 21, 25, 27, 31, and 37, which can encode a response regulator. Variants of such nucleotide sequences are also included, including sequences that have at least about 45%, 55%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.5% sequence identity thereto.

In one embodiment, the sequence according to the present invention or for use in the methods of the invention may be one or more of the nucleotide sequences set forth in SEQ ID NOS:5, 49, 51, 53, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 113, 115, 117, 119, 121, 151, 153, 155, 157, 159, 161, and 163, which can encode a polypeptide produced under the control of a two-component regulatory system. Variants of such nucleotide sequences are also included, including sequences that have at least about 45%, 55%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.5% sequence identity thereto.

In one embodiment, the sequence according to the present invention or for use in the methods of the invention may be one or more of the amino acid sequences set forth in SEQ ID NOS:4, 8, 14, 16, 20, 24, 30, 34, and 36, which can encode a histidine kinase. Variants of such amino acid sequences are also included, including sequences that have at least about 45%, 55%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.5% sequence identity thereto.

In one embodiment, the sequence according to the present invention or for use in the methods of the invention may be one or more of the amino acid sequences set forth in SEQ ID NOS:74, 76, 90, 92, 96, and 114, which can encode a bacteriocin. Variants of such amino acid sequences are also included, including sequences that have at least about 45%, 55%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.5% sequence identity thereto.

In one embodiment, the sequence according to the present invention or for use in the methods of the invention may be one or more of the amino acid sequences set forth in SEQ ID NOS:2, 10, 12, 18, 22, 26, 28, 32, and 38, which can encode a response regulator. Variants of such amino acid sequences are also included, including sequences that have at least about 45%, 55%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.5% sequence identity thereto.

In one embodiment, the sequence according to the present invention or for use in the methods of the invention may be one or more of the amino acid sequences set forth in SEQ ID NOS:6, 50, 52, 54, 74, 76, 78, 80, 82, 84, 66, 68, 90, 92, 94, 96, 98, 100, 102, 114, 116, 118, 120, 122, 152, 154, 156, 158, 160, 163, and 164, which can encode a polypeptide produced under the control of a two-component regulatory system. Variants of such amino acid sequences are also included, including sequences that have at least about 45%, 55%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.5% sequence identity thereto.

Full-length or partial nucleic acid sequences can be used to obtain homologues and orthologs encompassed by the present invention. By “orthologs” is intended genes derived from a common ancestral gene and which are found in different species as a result of speciation. Genes found in different species are considered orthologs when their nucleotide sequences and/or their encoded amino acid sequences share substantial identity as defined elsewhere herein. Functions of orthologs are often highly conserved among species.

Naturally occurring variants can exist within a population (e.g., the Lactobacillus acidophilus population). Such variants can be identified by using well-known molecular biology techniques, such as the polymerase chain reaction (PCR), and hybridization as described herein. Synthetically derived nucleotide sequences, for example, sequences generated by site-directed mutagenesis or PCR-mediated mutagenesis that still encode a two-component regulatory protein, are also included as variants. One or more nucleotide or amino acid substitutions, additions, and/or deletions can be introduced into a nucleotide or amino acid sequence disclosed herein, such that the substitutions, additions, or deletions are introduced into the encoded protein. The additions (insertions) and/or deletions (truncations) can be made at the N-terminal and/or C-terminal end of the native protein, and/or at one or more sites in the native protein. Similarly, a substitution of one or more nucleotides or amino acids can be made at one or more sites in the native protein.

For example, conservative amino acid substitutions can be made at one or more predicted, preferably nonessential amino acid residues. A “nonessential” amino acid residue is a residue that can be altered from the wild-type sequence of a protein without altering the biological activity, whereas an “essential” amino acid is required for biological activity. A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue with a similar side chain. Families of amino acid residues having similar side chains are known in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Such substitutions would not be made for conserved amino acid residues, or for amino acid residues residing within a conserved motif, where such residues are essential for protein activity.
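The side-chain families listed above can be expressed as a simple lookup. The following Python sketch is offered only as an illustration: the one-letter residue codes are standard, but the dictionary layout and the function names `shared_families` and `is_conservative` are hypothetical, and the rule that any shared family makes a substitution conservative is a simplifying assumption.

```python
# Side-chain families as recited in the text, using one-letter residue codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),               # lysine, arginine, histidine
    "acidic": set("DE"),               # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),       # threonine, valine, isoleucine
    "aromatic": set("YFWH"),
}


def shared_families(original: str, replacement: str) -> list[str]:
    """Return the names of the families that contain both residues."""
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if original in members and replacement in members
    )


def is_conservative(original: str, replacement: str) -> bool:
    """Treat a substitution as conservative if both residues share a family."""
    return original != replacement and bool(shared_families(original, replacement))


lys_to_arg = is_conservative("K", "R")   # both basic
asp_to_lys = is_conservative("D", "K")   # acidic versus basic
val_ile = shared_families("V", "I")      # members of two families at once
```

Because some residues belong to more than one family (for example, valine is both nonpolar and beta-branched), the sketch reports every shared family rather than a single label.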

Alternatively, mutations can be made randomly along all or part of the length of the two-component regulatory coding sequence or along all or part of the length of the sequences under the control of two-component sensing and regulatory molecules, such as by saturation mutagenesis. The mutants can be expressed recombinantly and screened for those that retain biological activity, e.g., by assaying for two-component regulatory system activity using standard assay techniques. Methods for mutagenesis and nucleotide sequence alterations are known in the art. See, for example, Kunkel (1985) Proc. Natl. Acad. Sci. USA 82:488-492, Kunkel et al. (1987) Methods in Enzymol. Molecular Biology (MacMillan Publishing Company, New York) and the references cited therein. Obviously, the mutations made in the DNA encoding the variant must not disrupt the reading frame and preferably will not create complementary regions that could produce secondary mRNA structure. See EP Patent Application Publication No. 75,444. Guidance as to appropriate amino acid substitutions that do not affect biological activity of the protein of interest may be found in the model of Dayhoff et al. (1978) Atlas of Protein Sequence and Structure (Natl. Biomed. Res. Found., Washington, D.C.), herein incorporated by reference in its entirety for these teachings.

The deletions, insertions, and substitutions of the amino acid sequences encompassed herein are not expected to produce radical changes in the characteristics of the protein. However, when it is difficult to predict the exact effect of the substitution, deletion, or insertion in advance of doing so, one skilled in the art will appreciate that the effect will be evaluated by routine screening assays. That is, the activity can be evaluated by comparing the activity of the modified sequence with the activity of the original sequence. See, for example, Baruah et al. (2004) J. Bacteriol. 186:1694-1704; Wang et al. (2001) J. Bacteriol. 183:2795-2802; and Piazza et al. (1999) J. Bacteriol. 181:4540-4548, each of which is herein incorporated by reference in its entirety for these teachings.

Variant nucleotide and amino acid sequences of the present invention also encompass sequences derived from mutagenic and recombinogenic procedures such as DNA shuffling. With such a procedure, one or more different polypeptides of the invention can be used to create a new polypeptide possessing the desired properties. In this manner, libraries of recombinant polynucleotides are generated from a population of related sequence polynucleotides comprising sequence regions that have substantial sequence identity and can be homologously recombined in vitro or in vivo. For example, using this approach, sequence motifs encoding a domain of interest can be shuffled between the two-component regulatory nucleic acid of the invention and other known two-component regulatory nucleic acids to obtain a new nucleic acid encoding a peptide, polypeptide or protein with an improved property of interest, such as an increased K_(m) in the case of an enzyme. Strategies for such DNA shuffling are known in the art. See, for example, Stemmer (1994) Proc. Natl. Acad. Sci. USA 91:10747-10751; Stemmer (1994) Nature 370:389-391; Crameri et al. (1997) Nature Biotech. 15:436-438; Moore et al. (1997) J. Mol. Biol. 272:336-347; Zhang et al. (1997) Proc. Natl. Acad. Sci. USA 94:4504-4509; Crameri et al. (1998) Nature 391:288-291; and U.S. Pat. Nos. 5,605,793 and 5,837,458.

Variants of the two-component regulatory proteins can function as either two-component-related agonists (mimetics) or as two-component-related antagonists. An agonist of the two-component-related protein can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of the two-component regulatory protein. An antagonist of the two-component regulatory protein can inhibit one or more of the activities of the naturally occurring form of the two-component regulatory protein by, for example, competitively binding to a downstream or upstream member of a cellular signaling cascade that includes the two-component regulatory protein.

Variants of a two-component regulatory protein or variants of polypeptides under the control of the two-component sensing and regulatory molecules that function as either agonists or antagonists can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a two-component regulatory protein for stress-related protein agonist or antagonist activity. In one embodiment, a variegated library of two-component regulatory variants is generated by combinatorial mutagenesis at the nucleic acid level and is encoded by a variegated gene library. A variegated library of two-component regulatory variants can be produced by, for example, enzymatically ligating a mixture of synthetic oligonucleotides into gene sequences such that a degenerate set of potential two-component regulatory sequences is expressible as individual polypeptides, or alternatively, as a set of larger fusion proteins (e.g., for phage display) containing the set of two-component regulatory sequences therein. There are a variety of methods that can be used to produce libraries of variants from a degenerate oligonucleotide sequence. Chemical synthesis of a degenerate gene sequence can be performed in an automatic DNA synthesizer, and the synthetic gene then ligated into an appropriate expression vector. Use of a degenerate set of genes allows for the provision, in one mixture, of all of the sequences encoding the desired set of potential two-component regulatory sequences. Methods for synthesizing degenerate oligonucleotides are known in the art (see, e.g., Narang (1983) Tetrahedron 39:3; Itakura et al. (1984) Annu. Rev. Biochem. 53:323; Itakura et al. (1984) Science 198:1056; Ike et al. (1983) Nucleic Acids Res. 11:477).
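To illustrate what a degenerate oligonucleotide represents, the following Python sketch enumerates every concrete sequence encoded by a short degenerate oligonucleotide written with the standard IUPAC nucleotide ambiguity codes. The function name `expand_degenerate` and the example oligonucleotides are hypothetical and are shown only as a sketch of the concept, not as the synthesis method of the cited references.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo: str) -> list[str]:
    """List every unambiguous sequence represented by a degenerate oligo."""
    try:
        choices = [IUPAC_CODES[base] for base in oligo.upper()]
    except KeyError as err:
        raise ValueError(f"unknown IUPAC code: {err.args[0]}") from None
    return ["".join(combo) for combo in product(*choices)]


# A fully randomized codon restricted to G or T in the third position.
nnk_codons = expand_degenerate("NNK")

# A short hypothetical oligonucleotide with one purine wobble position.
wobble = expand_degenerate("ATR")
```

The number of sequences grows multiplicatively with each ambiguous position, which is why a single degenerate synthesis can supply an entire variegated library in one mixture.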

In addition, libraries of fragments of a two-component regulatory protein coding sequence can be used to generate a variegated population of two-component regulatory fragments for screening and subsequent selection of variants of a two-component regulatory protein. In one embodiment, a library of coding sequence fragments can be generated by treating a double-stranded PCR fragment of a two-component regulatory coding sequence with a nuclease under conditions wherein nicking occurs only about once per molecule, denaturing the double-stranded DNA, renaturing the DNA to form double-stranded DNA which can include sense/antisense pairs from different nicked products, removing single-stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the resulting fragment library into an expression vector. By this method, one can derive an expression library that encodes N-terminal and internal fragments of various sizes of the protein.

Several techniques are known in the art for screening gene products of combinatorial libraries made by point mutations or truncation and for screening cDNA libraries for gene products having a selected property. Such techniques are adaptable for rapid screening of the gene libraries generated by the combinatorial mutagenesis of proteins. The most widely used techniques, which are amenable to high-throughput analysis, for screening large gene libraries typically include cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates isolation of the vector encoding the gene whose product was detected. Recursive ensemble mutagenesis (REM), a technique that enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify two-component regulatory variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6(3):327-331).

Sequence Identity

The two-component regulatory sequences and the sequences under the control of two-component sensing and regulatory molecules are members of various families of molecules with conserved functional features. By “family” is intended two or more proteins or nucleic acid molecules having sufficient nucleotide or amino acid sequence identity. By “sequence identity” is intended the nucleotide or amino acid residues that are the same when aligning two sequences for maximum correspondence over a specified comparison window. By “comparison window” is intended a contiguous segment of the two nucleotide or amino acid sequences for optimal alignment, wherein the second sequence can contain additions or deletions (i.e., gaps) as compared to the first sequence. Generally, for nucleic acid alignments, the comparison window is at least 20 contiguous nucleotides in length, and optionally can be 30, 40, 50, 100, or longer. For amino acid sequence alignments, the comparison window is at least 6 contiguous amino acids in length, and optionally can be 10, 15, 20, 30, or longer. Those of skill in the art understand that to avoid a high similarity due to inclusion of gaps, a gap penalty is typically introduced and is subtracted from the number of matches.

Family members can be from the same or different species, and can include homologues as well as distinct proteins. Often, members of a family display common functional characteristics. Homologues can be isolated based on their identity to the Lactobacillus acidophilus nucleic acid sequences disclosed herein using the cDNA, or a portion thereof, as a hybridization probe according to standard hybridization techniques under stringent hybridization conditions as disclosed herein.

To determine the percent identity of two amino acid or nucleotide sequences, an alignment is performed. Percent identity of the two sequences is a function of the number of identical residues shared by the two sequences in the comparison window (i.e., percent identity = (number of identical residues/total number of residues) × 100). In one embodiment, the sequences are the same length. Methods similar to those mentioned below can be used to determine the percent identity between two sequences. The methods can be used with or without allowing gaps. Alignment can also be performed manually by inspection.
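The percent identity calculation described above can be sketched in Python as follows. This is a minimal illustration only: it assumes the two sequences have already been aligned, that gaps are written as "-", and that the "total number of residues" is taken to be the number of aligned columns in the comparison window. The function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over an existing alignment of equal-length strings.

    Assumption: the denominator is the number of alignment columns in the
    comparison window, so a gapped column counts as a non-identical position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


# Ungapped example: 7 of 8 positions match.
ungapped = percent_identity("ACGTACGT", "ACGTACGA")

# Gapped example: 7 of 9 columns match; the gapped column is not a match.
gapped = percent_identity("ACGT-ACGT", "ACGTTACGA")
```

Other conventions for the denominator exist (for example, the length of the shorter sequence), and a different choice would change the reported value for gapped alignments.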

When amino acid sequences differ in conservative substitutions, the percent identity can be adjusted upward to correct for the conservative nature of the substitution. Means for making this adjustment are known in the art. Typically, the conservative substitution is scored as a partial, rather than a full, mismatch, thereby increasing the percentage sequence identity.

Mathematical algorithms can be used to determine the percent identity of two sequences. Non-limiting examples of mathematical algorithms are the algorithm of Karlin and Altschul (1990) Proc. Natl. Acad. Sci. USA 87:2264, modified as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877; the algorithm of Myers and Miller (1988) CABIOS 4:11-17; the local alignment algorithm of Smith et al. (1981) Adv. Appl. Math. 2:482; the global alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443-453; and the search-for-local-alignment method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. USA 85:2444-2448.
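As one illustration of the global alignment approach named above, the following Python sketch implements a basic Needleman-Wunsch dynamic program. It is a simplified teaching version: the match, mismatch, and linear gap scores are arbitrary assumptions chosen for the example, and it does not reproduce the scoring matrices or affine gap weights used by the GAP program or any other cited implementation.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2) -> tuple[int, str, str]:
    """Globally align two sequences; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def pair_score(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align two residues
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


best_score, top, bottom = needleman_wunsch("ACGT", "AGT")
```

With the assumed scores, the example aligns the three-residue sequence against the four-residue sequence by introducing a single gap.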

Various computer implementations based on these mathematical algorithms have been designed to enable the determination of sequence identity. The BLAST programs of Altschul et al. (1990) J. Mol. Biol. 215:403 are based on the algorithm of Karlin and Altschul (1990) supra. Searches to obtain nucleotide sequences that are homologous to nucleotide sequences of the present invention can be performed with the BLASTN program, score=100, wordlength=12. To identify amino acid sequences homologous to amino acid sequences of the proteins of the present invention, the BLASTX program can be used, score=50, wordlength=3. Gapped alignments can be obtained by using Gapped BLAST (in BLAST 2.0) as described in Altschul et al. (1997) Nucleic Acids Res. 25:3389. To detect distant relationships between molecules, PSI-BLAST can be used. See Altschul et al. (1997) supra. For all of the BLAST programs, the default parameters of the respective programs can be used. Alignment can also be performed manually by inspection.

Another program that can be used to determine percent sequence identity is the ALIGN program (version 2.0), which uses the mathematical algorithm of Myers and Miller (1988) supra. A PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used with this program when comparing amino acid sequences.

In addition to the ALIGN and BLAST programs, the BESTFIT, GAP, FASTA and TFASTA programs are part of the GCG Wisconsin Genetics Software Package, Version 10 (available from Accelrys Inc., 9685 Scranton Rd., San Diego, Calif., USA), and can be used for performing sequence alignments. The preferred program is GAP version 10, which uses the algorithm of Needleman and Wunsch (1970) supra. Unless otherwise stated, the sequence identity/similarity values provided herein refer to the value obtained using GAP Version 10 with the following parameters: % identity and % similarity for a nucleotide sequence using GAP Weight of 50 and Length Weight of 3 and the nwsgapdna.cmp scoring matrix; % identity and % similarity for an amino acid sequence using GAP Weight of 8 and Length Weight of 2, and the BLOSUM62 scoring matrix; or any equivalent program. By “equivalent program” is intended any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide or amino acid residue matches and an identical percent sequence identity when compared to the corresponding alignment generated by GAP Version 10.

Identification and Isolation of Homologous Sequences

Two-component regulatory nucleotide sequences or proteins under the control of two-component sensing and regulatory molecules identified based on their sequence identity to the sequences set forth herein or to fragments and variants thereof are encompassed by the present invention. Methods such as PCR or hybridization can be used to identify sequences from a cDNA or genomic library, for example, that are substantially identical to a sequence of the invention. See, for example, Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.) and Innis et al. (1990) PCR Protocols: A Guide to Methods and Applications (Academic Press, New York). Methods for construction of such cDNA and genomic libraries are generally known in the art and are also disclosed in the above references.

In hybridization techniques, the hybridization probes can be genomic DNA fragments, cDNA fragments, RNA fragments, and/or other oligonucleotides, and can consist of all or part of a known nucleotide sequence disclosed herein. In addition, they can be labeled with a detectable group such as ³²P, or any other detectable marker, such as other radioisotopes, a fluorescent compound, an enzyme, or an enzyme co-factor. Probes for hybridization can be made by labeling synthetic oligonucleotides based on the known two-component regulatory nucleotide sequences disclosed herein. Degenerate primers designed on the basis of conserved nucleotides or amino acid residues in a known two-component regulatory nucleotide sequence or encoded amino acid sequence can additionally be used. The hybridization probe typically comprises a region of nucleotide sequence that hybridizes under stringent conditions to at least about 10, or about 20, or about 50, 75, 100, 125, 150, 175, 200, 250, 300, 350, or 400 consecutive nucleotides of a two-component regulatory nucleotide sequence of the invention or a fragment or variant thereof. To achieve specific hybridization under a variety of conditions, such probes include sequences that are unique among two-component regulatory protein sequences or unique among proteins under the control of two-component sensing and regulatory molecules. Preparation of probes for hybridization is generally known in the art and is disclosed in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.), herein incorporated by reference in its entirety for these teachings.

In one embodiment, the entire nucleotide sequence of the invention is used as a probe to identify novel sequences and messenger RNAs. In another embodiment, the probe is a fragment of a nucleotide sequence disclosed herein. In some embodiments, the nucleotide sequence that hybridizes under stringent conditions to the probe can be at least about 300, 325, 350, 375, 400, 425, 450, 500, 550, 600, 650, 700, 800, 900, 1000, 1500, 2000, 2500, 3000, 3500, or 4000 nucleotides in length (including any value not explicitly stated herein).

Substantially identical sequences will hybridize to each other under stringent conditions. By “stringent conditions” is meant conditions under which a probe will hybridize to its target sequence to a detectably greater degree than to other sequences (e.g., at least 2-fold over background). Generally, stringent conditions encompass those conditions for hybridization and washing under which nucleotides having at least about 60%, 65%, 70%, preferably 75% sequence identity typically remain hybridized to each other. Stringent conditions (e.g., high, medium, low stringency) are known in the art and can be found in Current Protocols in Molecular Biology (John Wiley & Sons, New York (1989)), 6.3.1-6.3.6, the entire contents of which are incorporated herein by reference for these teachings. Hybridization typically occurs for less than about 24 hours, usually about 4 to about 12 hours.

Stringent conditions are sequence dependent and will differ in different circumstances. When using probes, stringent conditions can be, e.g., those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g., greater than 50 nucleotides).

The post-hybridization washes are instrumental in controlling specificity. The two key factors are the ionic strength and the temperature of the final wash solution. For the detection of sequences that hybridize to a full-length or approximately full-length target sequence, the temperature under stringent conditions is selected to be about 5° C. lower than the thermal melting point (T_(m)) for the specific sequence at a defined ionic strength and pH. However, stringent conditions would encompass temperatures in the range of 1° C. to 20° C. lower than the T_(m), depending on the desired degree of stringency as otherwise qualified herein. For DNA-DNA hybrids, the T_(m) can be determined using the equation of Meinkoth and Wahl (1984) Anal. Biochem. 138:267-284: T_(m) = 81.5° C. + 16.6(log M) + 0.41(% GC) − 0.61(% form) − 500/L; where M is the molarity of monovalent cations, % GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs. The T_(m) is the temperature (under defined ionic strength and pH) at which 50% of a complementary target sequence hybridizes to a perfectly matched probe.
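The T_(m) equation above can be evaluated directly. The following Python sketch is a minimal illustration that assumes the logarithm in the equation is base 10; the function name `melting_temperature_c` and the example input values are hypothetical and are not experimental conditions taken from this disclosure.

```python
import math


def melting_temperature_c(monovalent_molarity: float, percent_gc: float,
                          percent_formamide: float, hybrid_length_bp: int) -> float:
    """T_m (degrees C) of a DNA-DNA hybrid per the equation recited above.

    Assumption: "log M" denotes the base-10 logarithm of the monovalent
    cation molarity.
    """
    if monovalent_molarity <= 0:
        raise ValueError("monovalent cation molarity must be positive")
    if hybrid_length_bp <= 0:
        raise ValueError("hybrid length must be positive")
    return (
        81.5
        + 16.6 * math.log10(monovalent_molarity)
        + 0.41 * percent_gc
        - 0.61 * percent_formamide
        - 500.0 / hybrid_length_bp
    )


# Hypothetical inputs: 1 M monovalent cation, 50% GC, 500 bp hybrid.
tm_no_formamide = melting_temperature_c(1.0, 50.0, 0.0, 500)
tm_50_formamide = melting_temperature_c(1.0, 50.0, 50.0, 500)

# A stringent wash about 5 degrees C below the T_m, as described in the text.
stringent_wash_c = tm_50_formamide - 5.0
```

The sketch makes the individual contributions visible: raising the formamide percentage lowers the T_(m), while a higher GC content or a longer hybrid raises it.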

The ability to detect sequences with varying degrees of homology can be obtained by varying the stringency of the hybridization and/or washing conditions. To target sequences that are 100% identical (homologous probing), stringency conditions must be obtained that do not allow mismatching. By allowing mismatching of nucleotide residues to occur, sequences with a lower degree of similarity can be detected (heterologous probing). For every 1% of mismatching, the T_(m) is reduced about 1° C.; therefore, hybridization and/or wash conditions can be manipulated to allow hybridization of sequences of a target percentage identity. For example, if sequences with ≥90% sequence identity are preferred, the T_(m) can be decreased by 10° C. Two nucleotide sequences that fail to hybridize to each other under stringent conditions can still be substantially identical if the polypeptides they encode are substantially identical. This situation could arise, for example, if the maximum codon degeneracy of the genetic code is used to create a copy of a nucleic acid.
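The rule of thumb above (about 1° C. of T_(m) per 1% of mismatching) can be written as a one-line adjustment. In the following Python sketch the function name `tm_for_target_identity` and the example starting T_(m) are hypothetical, and the linear relationship is only the approximation stated in the text.

```python
def tm_for_target_identity(tm_perfect_match_c: float,
                           min_percent_identity: float) -> float:
    """Approximate T_m for hybrids at a target minimum percent identity.

    Applies the approximation of a 1 degree C reduction per 1% mismatch.
    """
    if not 0 < min_percent_identity <= 100:
        raise ValueError("percent identity must be in the range (0, 100]")
    percent_mismatch = 100.0 - min_percent_identity
    return tm_perfect_match_c - percent_mismatch * 1.0


# Hypothetical perfectly matched T_m of 70.5 degrees C.
tm_90 = tm_for_target_identity(70.5, 90.0)    # allows up to 10% mismatch
tm_100 = tm_for_target_identity(70.5, 100.0)  # homologous probing
```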

Exemplary low stringency conditions include hybridization with a buffer solution of 30-35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulfate) at 37° C., and a wash in 1× to 2×SSC (20×SSC=3.0 M NaCl/0.3 M trisodium citrate) at 50 to 55° C. Exemplary moderate stringency conditions include hybridization in 40 to 45% formamide, 1.0 M NaCl, 1% SDS at 37° C., and a wash in 0.5× to 1×SSC at 55 to 60° C. Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 0.1×SSC at 60 to 65° C. Optionally, wash buffers can comprise about 0.1% to about 1% SDS. Duration of hybridization is generally less than about 24 hours, and is usually about 4 to about 12 hours. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology—Hybridization with Nucleic Acid Probes, Part I, Chapter 2 (Elsevier, New York); and Ausubel et al., eds. (1995) Current Protocols in Molecular Biology, Chapter 2 (Greene Publishing and Wiley-Interscience, New York). See also Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed.; Cold Spring Harbor Laboratory Press, Plainview, N.Y.), the entire contents of which are incorporated herein by reference for these teachings.
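As a worked illustration of the wash conditions above, the following Python sketch converts an SSC fold-concentration into salt molarities using the stated 20×SSC composition, and records the exemplary wash ranges in a lookup table. The names used are hypothetical and the table merely restates the ranges given in the text.

```python
# Composition of the 20x SSC stock as stated in the text.
STOCK_FOLD = 20.0
STOCK_NACL_M = 3.0
STOCK_CITRATE_M = 0.3


def ssc_molarity(fold):
    """Return (NaCl molarity, trisodium citrate molarity) for a given SSC fold."""
    if fold <= 0:
        raise ValueError("SSC fold-concentration must be positive")
    scale = fold / STOCK_FOLD
    return STOCK_NACL_M * scale, STOCK_CITRATE_M * scale


# Exemplary wash ranges restated from the text: SSC fold range and temperature range (deg C).
WASH_CONDITIONS = {
    "low": {"ssc_fold": (1.0, 2.0), "temp_c": (50, 55)},
    "moderate": {"ssc_fold": (0.5, 1.0), "temp_c": (55, 60)},
    "high": {"ssc_fold": (0.1, 0.1), "temp_c": (60, 65)},
}

# Salt content of the high stringency wash (0.1x SSC).
high_nacl, high_citrate = ssc_molarity(WASH_CONDITIONS["high"]["ssc_fold"][0])
```

The sketch makes explicit that the high stringency wash combines the lowest salt concentration with the highest temperature of the three exemplary conditions.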

In a PCR approach, oligonucleotide primers can be designed for use in PCR reactions to amplify corresponding DNA sequences from cDNA or genomic DNA extracted from any organism of interest. PCR primers are preferably at least about 10 nucleotides in length, or at least about 20 nucleotides in length. Methods for designing PCR primers and PCR cloning are generally known in the art and are disclosed in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.). See also Innis et al., eds. (1990) PCR Protocols: A Guide to Methods and Applications (Academic Press, New York); Innis and Gelfand, eds. (1995) PCR Strategies (Academic Press, New York); and Innis and Gelfand, eds. (1999) PCR Methods Manual (Academic Press, New York), the entire contents of which are incorporated herein by reference for these teachings. Known methods of PCR include, but are not limited to, methods using paired primers, nested primers, single specific primers, degenerate primers, gene-specific primers, vector-specific primers, partially-mismatched primers, and the like.

Assays

Diagnostic assays to detect expression of the peptides, polypeptides and/or nucleic acid molecules of this invention, as well as their disclosed activity in a sample, are disclosed. An exemplary method for detecting the presence or absence of a nucleic acid or protein of this invention in a sample comprises obtaining a sample from a food/dairy/feed product, starter culture (mother, seed, bulk/set, concentrated, dried, lyophilized, frozen), cultured food/dairy/feed product, dietary supplement, bioprocessing fermentate, a subject (e.g., a subject that has ingested a probiotic material), etc., and contacting the sample with a compound or an agent that interacts with or combines with the peptides, polypeptides or nucleic acids of this invention in a detectable manner (e.g., an mRNA or genomic DNA comprising the disclosed nucleic acid or fragment thereof) such that the presence of the peptide or nucleic acid is detected in the sample. Results obtained with a sample from the food, supplement, culture, product, or subject can be compared to results obtained with a sample from a control culture, product, or subject and a qualitative and/or quantitative determination of the presence of a polypeptide or nucleic acid of this invention in the sample can be made.

One agent for detecting the mRNA and/or genomic DNA comprising a disclosed nucleotide sequence of this invention is a labeled nucleic acid probe capable of hybridizing to the nucleotide sequence present in the mRNA and/or genomic DNA. The nucleic acid probe can be, for example, a disclosed nucleic acid molecule, such as the nucleic acid of odd numbered SEQ ID NOS:1-164, or a fragment thereof, such as a nucleic acid molecule of at least 15, 30, 50, 100, 250, or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to the mRNA or genomic DNA comprising the disclosed nucleic acid sequence. Other suitable probes for use in the diagnostic assays of the invention are described herein.

One agent for detecting a protein of this invention is an antibody or ligand that specifically binds a peptide or protein of this invention. In some embodiments, the antibody or ligand can comprise a detectable label. Antibodies of this invention can be polyclonal or monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂), can be used. The term “labeled,” with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody, by reactivity with another reagent that is directly labeled. Examples of indirect labeling include detection of a primary antibody using a fluorescently labeled secondary antibody and end-labeling of a DNA probe with biotin such that it can be detected with fluorescently labeled streptavidin.

An isolated peptide, polypeptide or protein of the present invention can be used as an antigen or immunogen to generate antibodies that specifically bind two-component regulatory proteins or proteins under the control of two-component sensing and regulatory molecules, or to stimulate production of antibodies in vivo. The full-length polypeptide of the invention can be used as an immunogen or, alternatively, antigenic peptide fragments can be used. The antigenic peptide can comprise at least 8, 10, 15, 20, or 30 or more amino acid residues of the amino acid sequence shown in even numbered SEQ ID NOS:1-164 and encompasses an epitope of a two-component regulatory protein or a protein under the control of two-component sensing and regulatory molecules such that an antibody raised against the peptide forms a specific immune complex with the protein or fragment thereof. An epitope encompassed by the antigenic peptide can comprise regions of a protein that are located on the surface of the protein, e.g., a hydrophilic region.

The term “sample” is intended to include tissues, cells, and biological fluids present in or isolated from a subject, as well as cells from starter cultures or food products carrying such cultures, or derived from the use of such cultures. That is, the detection method of the invention can be used to detect mRNA, protein, or genomic DNA comprising a nucleic acid molecule or amino acid sequence of this invention in a sample both in vitro and in vivo. In vitro techniques for detection of mRNA comprising a disclosed sequence include, but are not limited to, Northern hybridizations and in situ hybridizations. In vitro techniques for detection of a protein comprising a disclosed polypeptide include, but are not limited to, enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations, and immunofluorescence. In vitro techniques for detection of genomic DNA comprising the disclosed nucleotide sequences include, but are not limited to, Southern hybridizations. Furthermore, in vivo techniques for detection of a protein of this invention include introducing into a subject a labeled antibody or ligand that specifically binds the protein. For example, the antibody or ligand can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques.

In one embodiment, the sample of this invention comprises protein molecules from a subject that has consumed a probiotic material. Alternatively, the sample can contain mRNA or genomic DNA from a starter culture.

The invention also encompasses kits for detecting the presence of the nucleic acids or proteins of this invention in a sample. Such kits can be used to determine if a microbe producing a specific polypeptide of the invention is present in a food product or starter culture, or in a subject that has consumed a probiotic material. For example, the kit can comprise a labeled compound or agent capable of detecting a disclosed polypeptide or mRNA in a sample and means for determining the amount of the disclosed polypeptide in the sample (e.g., an antibody or ligand that specifically binds the disclosed polypeptide, or a nucleic acid probe that hybridizes with nucleic acid sequences encoding a disclosed polypeptide, e.g., odd numbered SEQ ID NOS:1-164). Kits can also include instructions detailing the use of such compounds.

For antibody-based kits, the kit can comprise, for example: (1) a first antibody (e.g., attached to a solid support) that binds to a disclosed polypeptide; and, optionally, (2) a second, different antibody that binds to the disclosed polypeptide or the first antibody and is conjugated to a detectable agent. For nucleic acid-based kits, the kit can comprise, for example: (1) a nucleic acid molecule, e.g., a detectably labeled oligonucleotide, that hybridizes to a disclosed nucleic acid sequence or (2) a pair of primers useful for amplifying a disclosed nucleic acid molecule.

The kit can also comprise, e.g., a buffering agent, a preservative, and/or a protein stabilizing agent. The kit can also comprise components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples that can be assayed and compared to the test sample. Each component of the kit can be enclosed within an individual container, and all of the various containers can be within a single package along with instructions for use.

In one embodiment, the kit comprises multiple probes in an array format, such as those described, for example, in U.S. Pat. Nos. 5,412,087 and 5,545,531, and International Publication No. WO 95/00530, herein incorporated by reference in their entireties. Probes for use in the array can be synthesized either directly onto the surface of the array, as disclosed in International Publication No. WO 95/00530, or prior to immobilization onto the array surface (Gait, ed. (1984) Oligonucleotide Synthesis: A Practical Approach (IRL Press, Oxford, England)). The probes can be immobilized onto the surface using techniques well known to one of skill in the art, such as those described in U.S. Pat. No. 5,412,087. Probes can be a nucleic acid or amino acid sequence, preferably purified, or an antibody.

The arrays can be used to screen organisms, samples, or products for differences in their genomic, cDNA, polypeptide, or antibody content, including the presence or absence of specific sequences or proteins, as well as the concentration of those materials. Binding to a capture probe is detected, for example, by signal generated from a label attached to the nucleic acid molecule comprising the disclosed nucleic acid sequence, a polypeptide comprising the disclosed amino acid sequence, or an antibody. The method can include contacting the molecule comprising the disclosed nucleic acid, polypeptide, or antibody with a first array having a plurality of capture probes and a second array having a different plurality of capture probes. The results of each hybridization can be compared to analyze differences in expression between a first and second sample. The first plurality of capture probes can be from a control sample, e.g., a wild-type lactic acid bacterium, or control subject, e.g., a food, dietary supplement, starter culture sample or a biological fluid. The second plurality of capture probes can be from an experimental sample, e.g., a mutant-type lactic acid bacterium, or subject that has consumed a probiotic material, e.g., a starter culture sample, or a biological fluid.

These assays can be especially useful in microbial selection and quality control procedures where the detection of unwanted materials is essential. The detection of particular nucleotide sequences or polypeptides can also be useful in determining the genetic composition of food, fermentation products, or industrial microbes, or microbes present in the digestive system of animals or humans that have consumed probiotics.

The present invention further provides a nucleic acid array or chip, i.e., a multitude of nucleic acids (e.g., DNA) as molecular probes precisely organized or arrayed on a solid support, which allows for the sequencing of genes, the study of mutations contained therein and/or the analysis of the expression of genes. Such arrays and chips are currently of interest given their very small size and their high capacity in terms of number of analyses.

For an analysis, the carrier, such as in a DNA array/chip, is coated with DNA probes (e.g., oligonucleotides) that are arranged at a predetermined location or position on the carrier. A sample containing a target nucleic acid and/or fragments thereof to be analyzed, for example DNA or RNA or cDNA, that has been labeled beforehand, is contacted with the DNA array/chip leading to the formation, through hybridization, of a duplex. After a washing step, analysis of the surface of the chip allows any hybridizations to be located by means of the signals emitted by the labeled target. A hybridization fingerprint results, which, by computer processing, allows retrieval of information such as the expression of genes, the presence of specific fragments in the sample, the determination of sequences and/or the identification of mutations.

In one embodiment of this invention, hybridization between target nucleic acids and nucleic acids of the invention, used in the form of probes and deposited or synthesized in situ on a DNA chip/array, can be determined by means of fluorescence, radioactivity, electronic detection or the like, as are well known in the art.

In another embodiment, the nucleotide sequences of the invention can be used in the form of a DNA array/chip to carry out analyses of the expression of Lactobacillus acidophilus genes. This analysis is based on DNA array/chips on which probes, chosen for their specificity to characterize a given gene or nucleotide sequence, are present. The target sequences to be analyzed are labeled before being hybridized onto the chip. After washing, the labeled complexes are detected and quantified. Comparative analyses of the signal intensities obtained with respect to the same probe for different samples and/or for different probes with the same sample allow, for example, detection of differential transcription of RNA derived from the sample.
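One simple way to compare signal intensities for the same probe across two samples is a log ratio, sketched below in Python for illustration only. The text does not specify any particular statistic; the log2 ratio, the pseudocount, the function name, and the probe names and intensity values are all assumptions made for this example.

```python
import math


def log2_ratios(reference, experimental, pseudocount=1.0):
    """Return log2(experimental / reference) per probe present in both samples.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a probe gives no signal in one of the samples.
    """
    ratios = {}
    for probe, ref_signal in reference.items():
        if probe in experimental:
            exp_signal = experimental[probe]
            ratios[probe] = math.log2(
                (exp_signal + pseudocount) / (ref_signal + pseudocount)
            )
    return ratios


# Hypothetical background-corrected intensities for two probes.
reference_sample = {"probe_A": 99.0, "probe_B": 49.0}
experimental_sample = {"probe_A": 399.0, "probe_B": 49.0}

ratios = log2_ratios(reference_sample, experimental_sample)
```

In this hypothetical data set, probe_A shows a four-fold higher signal in the experimental sample (a log2 ratio of 2), whereas probe_B is unchanged (a log2 ratio of 0).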

In yet another embodiment, arrays/chips containing nucleotide sequences of the invention can comprise nucleotide sequences specific for other microorganisms, which allows for serial testing and rapid identification of the presence of a microorganism in a sample.

In a further embodiment, the principle of the DNA array/chip can also be used to produce protein arrays/chips on which the support has been coated with a polypeptide and/or an antibody of this invention, or arrays thereof, in place of the nucleic acid. These protein arrays/chips make it possible, for example, to analyze the biomolecular interactions induced by the affinity capture of targets onto a support coated, e.g., with proteins, by surface plasmon resonance (SPR). The polypeptides or antibodies of this invention, capable of specifically binding antibodies or polypeptides derived from the sample to be analyzed, can be used in protein arrays/chips for the detection and/or identification of proteins and/or peptides in a sample.

Thus, the present invention provides a microarray or microchip comprising various nucleic acids of this invention in any combination, including repeats, as well as a microarray comprising various polypeptides of this invention in any combination, including repeats. Also provided is a microarray comprising one or more antibodies that specifically react with various polypeptides of this invention, in any combination, including repeats.

Antisense Nucleotide Sequences

The present invention also encompasses antisense nucleic acid molecules, i.e., molecules that are complementary to a sense nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule, or complementary to an mRNA sequence. Accordingly, an antisense nucleic acid can hydrogen bond to a sense nucleic acid. The antisense nucleic acid can be complementary to an entire sequence, or to only a portion thereof, e.g., all or part of the protein coding region (or open reading frame). An antisense nucleic acid molecule can be antisense to a noncoding region of the coding strand of a nucleotide sequence of the invention. The noncoding regions are the 5′ and 3′ sequences that flank the coding region and are not translated into amino acids. Antisense nucleotide sequences are useful in disrupting the expression of the target gene. Antisense constructions having 70%, 80%, or 85% sequence identity to the corresponding sequence can be used.

Given the coding-strand sequence encoding a protein disclosed herein (e.g., odd numbered SEQ ID NOS:1-164), antisense nucleic acids of the invention can be designed according to the rules of Watson and Crick base pairing. The antisense nucleic acid molecule can be complementary to the entire coding region of the mRNA, but can also be an oligonucleotide that is antisense to only a portion of the coding or noncoding region of the mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 nucleotides in length, or it can be 100, 200 nucleotides, or greater in length, including any value in between those listed herein. An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation procedures known in the art.
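For illustration only, designing an antisense oligonucleotide from a coding-strand sequence by Watson and Crick base pairing can be sketched in Python as a reverse complement. The function names and the short example sequence are hypothetical; the example sequence is not one of the disclosed SEQ ID NOS.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    invalid = set(dna) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unexpected characters in sequence: {sorted(invalid)}")
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_oligo(coding_strand, start, length):
    """Return an oligonucleotide antisense to coding_strand[start:start + length]."""
    region = coding_strand[start:start + length]
    if len(region) != length:
        raise ValueError("requested region extends past the end of the sequence")
    return reverse_complement(region)


# Hypothetical 12-nucleotide coding-strand fragment.
fragment = "ATGGCTAAGCTT"
full_antisense = reverse_complement(fragment)
first_six = antisense_oligo(fragment, 0, 6)
```

Because the returned oligonucleotide is written 5′ to 3′, it pairs antiparallel with the selected portion of the coding strand or of the corresponding mRNA.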

An antisense nucleic acid molecule of the invention can be an α-anomeric nucleic acid molecule (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330). The invention also encompasses ribozymes, which are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a complementary region. The invention also encompasses nucleic acid molecules that form triple helical structures. See generally Helene (1991) Anticancer Drug Des. 6(6):569; Helene (1992) Ann. N.Y. Acad. Sci. 660:27; and Maher (1992) Bioessays 14(12):807, the entire contents of each of which are incorporated herein by reference for these teachings.

In some embodiments, the nucleic acid molecules of the invention can be modified at the base moiety, sugar moiety, or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. As used herein, the terms “peptide nucleic acids” or “PNAs” refer to nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid-phase peptide synthesis protocols as described, for example, in Hyrup et al. (1996) supra; and Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. USA 93:14670, the entire contents of each of which are incorporated herein by reference for these teachings.

In another embodiment, PNAs of a sequence can be modified, e.g., to enhance stability, specificity, or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug delivery known in the art. The synthesis of PNA-DNA chimeras can be performed as described in Hyrup (1996) supra; Finn et al. (1996) Nucleic Acids Res. 24(17):3357-3363; Mag et al. (1989) Nucleic Acids Res. 17:5973; and Peterson et al. (1995) Bioorganic Med. Chem. Lett. 5:1119, the entire contents of each of which are incorporated herein by reference for these teachings.

Fusion Proteins

The invention also includes chimeric or fusion proteins. A “chimeric protein” or “fusion protein” of this invention comprises a peptide or polypeptide as described herein operably linked (e.g., in frame) to a heterologous peptide or polypeptide. “Heterologous” in reference to a sequence is a sequence that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention. For example, a promoter operably linked to a heterologous polynucleotide is from a species different from the species from which the polynucleotide was derived, or, if from the same/analogous species, one or both are substantially modified from their original form and/or genomic locus, or the promoter is not the native promoter for the operably linked polynucleotide. A “heterologous peptide or polypeptide” refers to a peptide or polypeptide having an amino acid sequence corresponding to a protein that is not substantially identical to the amino acid sequence or protein of this invention, and which is derived from the same or a different organism. Within a fusion protein of this invention, the two-component regulatory peptide or polypeptide or the protein under the control of two-component sensing and regulatory molecules can comprise all or a portion of a polypeptide of the invention, preferably including at least one biologically active portion of the polypeptide. Within the fusion protein, the term “linked” is intended to indicate that the two-component regulatory peptide or polypeptide or the protein under the control of two-component sensing and regulatory molecules and the heterologous peptide or polypeptide are fused or joined or connected in-frame to each other. The heterologous peptide or polypeptide can be fused to the N-terminus and/or C-terminus of a peptide or polypeptide of this invention.

Expression of the linked coding sequences (e.g., a nucleotide sequence encoding the peptide or polypeptide of the invention linked in frame with a nucleotide sequence encoding the heterologous peptide or polypeptide) in some embodiments results in production of the fusion protein. The heterologous sequence can be a polypeptide that potentiates or increases production of the fusion protein in a cell. The portion of the fusion protein encoded by the heterologous sequence, i.e., the heterologous polypeptide, can be a protein fragment or peptide, an entire functional moiety, or an entire protein sequence. The heterologous peptide or polypeptide can be designed to be used in purifying the fusion protein, either with antibodies or with affinity purification specific for the heterologous polypeptide. Likewise, physical properties of the heterologous polypeptide can be exploited to allow selective purification of the fusion protein. Particular heterologous polypeptides of interest include superoxide dismutase (SOD), maltose-binding protein (MBP), glutathione-S-transferase (GST), an N-terminal histidine (His) tag, immunoglobulin, and the like. This list is not intended to be limiting, as any heterologous polypeptide (e.g., a protein that potentiates production of the two-component regulatory protein as a fusion protein) can be used in the compositions and methods of the invention. In one embodiment, the fusion protein is a GST-two-component regulatory fusion protein in which the two-component regulatory sequences are fused to the C-terminus of the GST sequences. In another embodiment, the fusion protein is a two-component regulatory-immunoglobulin fusion protein in which all or part of a two-component regulatory protein is fused to sequences derived from a member of the immunoglobulin protein family.

The immunoglobulin fusion proteins of the invention can be used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays to identify molecules that inhibit the interaction of a protein of the invention with a ligand.

One of skill in the art will recognize that the particular heterologous polypeptide is chosen with the purification scheme in mind. For example, His tags, GST, and maltose-binding protein represent heterologous polypeptides that have readily available affinity columns to which they can be bound and from which they can be eluted. Thus, where the heterologous polypeptide is an N-terminal His tag such as hexahistidine (His₆ tag), the two-component regulatory fusion protein can be purified using a matrix comprising a metal-chelating resin, for example, nickel nitrilotriacetic acid (Ni-NTA), nickel iminodiacetic acid (Ni-IDA), or cobalt-containing resin (Co-resin). See, for example, Steinert et al. (1997) QIAGEN News 4:11-15, herein incorporated by reference in its entirety for these teachings. Where the heterologous polypeptide is GST, the fusion protein can be purified using a matrix comprising glutathione-agarose beads (Sigma or Pharmacia Biotech); where the heterologous polypeptide is a maltose-binding protein (MBP), the fusion protein can be purified using a matrix comprising an agarose resin derivatized with amylose.

Preferably, a chimeric or fusion protein of the invention is produced by standard recombinant DNA techniques. For example, nucleic acid fragments coding for the different polypeptide sequences can be ligated together in-frame, or the fusion nucleic acid can be synthesized, such as with automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that give rise to complementary overhangs between two consecutive nucleic acid fragments, which can subsequently be annealed and re-amplified to generate a chimeric nucleic acid sequence (see, e.g., Ausubel et al., eds. (1995) Current Protocols in Molecular Biology (Greene Publishing and Wiley-Interscience, New York)). Moreover, the sequences of the invention can be cloned into a commercially available expression vector such that they are linked in-frame to an existing fusion moiety. Thus, the present invention also provides a vector comprising a nucleic acid encoding a fusion protein of this invention.

A fusion protein expression vector is typically designed for ease of removing the heterologous polypeptide to allow the two-component regulatory protein or the protein under the control of two-component sensing and regulatory molecules to retain the native biological activity associated with it. Methods for cleavage of fusion proteins are known in the art. See, for example, Ausubel et al., eds. (1998) Current Protocols in Molecular Biology (John Wiley & Sons, Inc.). Chemical cleavage of the fusion protein can be accomplished with reagents such as cyanogen bromide, 2-(2-nitrophenylsulphenyl)-3-methyl-3′-bromoindolenine, hydroxylamine, or low pH. Chemical cleavage is often accomplished under denaturing conditions to cleave otherwise insoluble fusion proteins.

Where separation of the polypeptide from the heterologous polypeptide is desired and a cleavage site at the junction between these fused polypeptides is not naturally occurring, the fusion construct can be designed to contain a specific protease cleavage site to facilitate enzymatic cleavage and removal of the heterologous polypeptide. In this manner, a linker sequence comprising a coding sequence for a peptide that has a cleavage site specific for an enzyme of interest can be fused in-frame between the coding sequence for the heterologous polypeptide (for example, MBP, GST, SOD, or an N-terminal His tag) and the coding sequence for the two-component regulatory polypeptide. Suitable enzymes having specificity for cleavage sites include, but are not limited to, factor Xa, thrombin, enterokinase, renin, collagenase, and tobacco etch virus (TEV) protease. Cleavage sites for these enzymes are well known in the art. Thus, for example, where factor Xa is to be used to cleave the heterologous polypeptide from the two-component regulatory polypeptide, the fusion construct can be designed to comprise a linker sequence encoding a factor Xa-sensitive cleavage site, for example, the sequence IEGR (see, for example, Nagai and Thøgersen (1984) Nature 309:810-812, Nagai and Thøgersen (1987) Meth. Enzymol. 153:461-481, and Pryor and Leiting (1997) Protein Expr. Purif. 10(3):309-319, herein incorporated by reference). Where thrombin is to be used to cleave the heterologous polypeptide from the two-component regulatory polypeptide, the fusion construct can be designed to comprise a linker sequence encoding a thrombin-sensitive cleavage site, for example the sequence LVPRGS or VIAGR (see, for example, Pryor and Leiting (1997) Protein Expr. Purif. 10(3):309-319, and Hong et al. (1997) Chin. Med. Sci. J. 12(3):143-147, respectively, herein incorporated by reference). Cleavage sites for TEV protease are known in the art. See, for example, the cleavage sites described in U.S. Patent No. 5,532,142, herein incorporated by reference in its entirety. See also the discussion in Ausubel et al., eds. (1998) Current Protocols in Molecular Biology (John Wiley & Sons, Inc.), Chapter 16.
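The linker design described above can be sketched at the amino acid level in Python, for illustration only. The recognition sequences IEGR and LVPRGS are those named in the text; the cut positions (after the arginine of IEGR for factor Xa, and between the arginine and glycine of LVPRGS for thrombin) are general knowledge assumed here and are not stated in the text. The tag and target sequences and all function names are hypothetical.

```python
# protease -> (recognition sequence, number of site residues left on the N-terminal fragment)
# Cut positions are an assumption of this sketch, not taken from the text.
CLEAVAGE_SITES = {
    "factor_xa": ("IEGR", 4),    # assumed to cut after IEGR
    "thrombin": ("LVPRGS", 4),   # assumed to cut between LVPR and GS
}


def build_fusion(tag, protease, target):
    """Join tag, protease recognition linker, and target into one polypeptide."""
    site, _ = CLEAVAGE_SITES[protease]
    return tag + site + target


def cleave(fusion, protease):
    """Split a fusion polypeptide at the first occurrence of the protease site.

    Only the first occurrence is considered; a target that itself contains
    the recognition sequence would need separate handling.
    """
    site, cut_offset = CLEAVAGE_SITES[protease]
    start = fusion.find(site)
    if start < 0:
        raise ValueError(f"no {protease} recognition sequence found")
    cut = start + cut_offset
    return fusion[:cut], fusion[cut:]


# Hypothetical N-terminal His tag and hypothetical target polypeptide.
his_tag = "MHHHHHH"
target = "MKTAYIAK"

xa_fusion = build_fusion(his_tag, "factor_xa", target)
xa_tag_part, xa_released = cleave(xa_fusion, "factor_xa")

thrombin_fusion = build_fusion(his_tag, "thrombin", target)
thr_tag_part, thr_released = cleave(thrombin_fusion, "thrombin")
```

Under these assumptions, the factor Xa construct releases the target without additional residues, whereas the thrombin construct leaves the two linker residues GS at the N-terminus of the released polypeptide.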

Antibodies

An isolated polypeptide of the present invention can be used as an immunogen to generate antibodies that specifically bind to the sequence of the invention or to stimulate production of antibodies in vivo. A full-length polypeptide of the invention can be used as an immunogen or, alternatively, antigenic peptide fragments of the polypeptides described herein can be used. The antigenic peptide of the polypeptide comprises at least 8, preferably 10, 15, 20, or 30 amino acid residues of the amino acid sequence shown in even numbered SEQ ID NOS:2-164 and encompasses an epitope of a protein of the invention such that an antibody raised against the peptide forms a specific immune complex with the related protein. Specific epitopes encompassed by the antigenic peptide are regions that can be located on the surface of the protein, e.g., hydrophilic regions.

Recombinant Expression Vectors and Cells

The nucleic acid molecules of the present invention can be included in vectors, which can be expression vectors. “Vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. Expression vectors include one or more regulatory sequences and direct the expression of nucleic acids to which they are operably linked. By “operably linked” is intended that the nucleotide sequence of interest is linked to the regulatory sequence(s) such that expression of the nucleotide sequence is allowed (e.g., in an in vitro transcription/translation system or in a cell when the vector is introduced into the cell). The term “regulatory sequence” is intended to include controllable transcriptional promoters, operators, enhancers, transcriptional terminators, and other expression control elements such as translational control sequences (e.g., Shine-Dalgarno consensus sequence, initiation and termination codons). These regulatory sequences will differ, for example, depending on the cell being used.

The vectors can be autonomously replicated in a cell (episomal vectors), or can be integrated into the genome of a cell, and replicated along with the host genome (non-episomal mammalian vectors). Integrating vectors can contain at least one sequence homologous to the bacterial chromosome that allows for recombination to occur between homologous DNA in the vector and the bacterial chromosome. Integrating vectors can also comprise bacteriophage or transposon sequences. Episomal vectors, or plasmids, are circular double-stranded DNA loops into which additional DNA segments can be ligated. Plasmids capable of stable maintenance in a cell are generally the preferred form of expression vectors when using recombinant DNA techniques.

The expression constructs or vectors encompassed in the present invention comprise a nucleic acid construct of the invention in a form suitable for expression of the nucleic acid in a cell. Expression in prokaryotic cells is encompassed in the present invention. It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the cell to be transformed, the level of expression of protein desired, etc. The expression vectors of the invention can be introduced into cells to thereby produce proteins or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., two-component regulatory proteins, mutant forms of two-component regulatory proteins, fusion proteins, etc.).

Regulatory sequences include those that direct constitutive expression of a nucleotide sequence as well as those that direct inducible expression of the nucleotide sequence only under certain conditions. A bacterial promoter is any DNA sequence capable of binding bacterial RNA polymerase and initiating the downstream (3′) transcription of a coding sequence into mRNA. A promoter can have a transcription initiation region, which is usually placed proximal to the 5′ end of the coding sequence. This transcription initiation region typically includes an RNA polymerase binding site and a transcription initiation site. A bacterial promoter can also have a second domain called an operator, which can overlap an adjacent RNA polymerase binding site at which RNA synthesis begins. The operator permits negatively regulated (inducible) transcription, as a gene repressor protein can bind the operator and thereby inhibit transcription of a specific gene. Constitutive expression can occur in the absence of negative regulatory elements, such as the operator. In addition, positive regulation can be achieved by a gene activator protein binding sequence, which, if present, is usually proximal (5′) to the RNA polymerase binding sequence.

An example of a gene activator protein is the catabolite activator protein (CAP), which helps initiate transcription of the lac operon in Escherichia coli (Raibaud et al. (1984) Annu. Rev. Genet. 18:173). Regulated expression can therefore be either positive or negative, thereby either enhancing or reducing transcription. Other examples of positive and negative regulatory elements are well known in the art. Various promoters that can be included in the protein expression system include, but are not limited to, a T7/LacO hybrid promoter, a trp promoter, a T7 promoter, a lac promoter, and a bacteriophage lambda promoter. Any suitable promoter can be used to carry out the present invention, including the native promoter or a heterologous promoter. Heterologous promoters can be constitutively active or inducible. A non-limiting example of a heterologous promoter is given in U.S. Pat. No. 6,242,194 to Kullen and Klaenhammer.

Sequences encoding metabolic pathway enzymes provide particularly useful promoter sequences. Examples include promoter sequences derived from sugar metabolizing enzymes, such as galactose, lactose (lac) (Chang et al. (1987) Nature 198:1056), and maltose. Additional examples include promoter sequences derived from biosynthetic enzymes such as tryptophan (trp) (Goeddel et al. (1980) Nucleic Acids Res. 8:4057; Yelverton et al. (1981) Nucleic Acids Res. 9:731; U.S. Pat. No. 4,738,921; EPO Publication Nos. 36,776 and 121,775). The beta-lactamase (bla) promoter system (Weissmann (1981) “The Cloning of Interferon and Other Mistakes,” in Interferon 3 (ed. I. Gresser)); bacteriophage lambda PL (Shimatake et al. (1981) Nature 292:128); the arabinose-inducible araB promoter (U.S. Pat. No. 5,028,530); and T5 (U.S. Pat. No. 4,689,406) promoter systems also provide useful promoter sequences. See also Balbas (2001) Mol. Biotech. 19:251-267, where E. coli expression systems are discussed.

In addition, synthetic promoters that do not occur in nature also function as bacterial promoters. For example, transcription activation sequences of one bacterial or bacteriophage promoter can be joined with the operon sequences of another bacterial or bacteriophage promoter, creating a synthetic hybrid promoter (U.S. Pat. No. 4,551,433). For example, the tac (Amann et al. (1983) Gene 25:167; de Boer et al. (1983) Proc. Natl. Acad. Sci. 80:21) and trc (Brosius et al. (1985) J. Biol. Chem. 260:3539-3541) promoters are hybrid trp-lac promoters comprised of both trp promoter and lac operon sequences that are regulated by the lac repressor. The tac promoter has the additional feature of being an inducible regulatory sequence. Thus, for example, expression of a coding sequence operably linked to the tac promoter can be induced in a cell culture by adding isopropyl-1-thio-β-D-galactoside (IPTG). Furthermore, a bacterial promoter can include naturally occurring promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and initiate transcription. A naturally occurring promoter of non-bacterial origin can also be coupled with a compatible RNA polymerase to produce high levels of expression of some genes in prokaryotes. The bacteriophage T7 RNA polymerase/promoter system is an example of a coupled promoter system (Studier et al. (1986) J. Mol. Biol. 189:113; Tabor et al. (1985) Proc. Natl. Acad. Sci. 82:1074). In addition, a hybrid promoter can also be comprised of a bacteriophage promoter and an E. coli operator region (EPO Publication No. 267,851).

The vector can additionally contain a nucleotide sequence encoding the repressor (or inducer) for that promoter. For example, an inducible vector of the present invention can regulate transcription from the Lac operator (LacO) by expressing the nucleotide sequence encoding the LacI repressor protein. Other examples include the use of the lexA gene to regulate expression of pRecA, and the use of trpO to regulate ptrp. Alleles of such genes that increase the extent of repression (e.g., lacIq) or that modify the manner of induction (e.g., lambda CI857, rendering lambda pL thermo-inducible, or lambda CI+, rendering lambda pL chemo-inducible) can be employed. In addition to a functioning promoter sequence, an efficient ribosome-binding site is also useful for the expression of the fusion construct. In prokaryotes, the ribosome binding site is called the Shine-Dalgarno (SD) sequence and includes an initiation codon (ATG) and a sequence 3-9 nucleotides in length located 3-11 nucleotides upstream of the initiation codon (Shine et al. (1975) Nature 254:34). The SD sequence is thought to promote binding of mRNA to the ribosome by the pairing of bases between the SD sequence and the 3′ end of bacterial 16S rRNA (Steitz et al. (1979) “Genetic Signals and Nucleotide Sequences in Messenger RNA,” in Biological Regulation and Development: Gene Expression (ed. R. F. Goldberger, Plenum Press, NY)).
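By way of a non-limiting illustration, the spacing described above (an SD sequence 3-9 nucleotides in length located 3-11 nucleotides upstream of the initiation codon) can be sketched as a simple sequence search. The consensus string TAAGGAGGT, the function name find_sd, and the example sequence are assumptions introduced here for illustration only; they are not taken from the sequences of the invention, and a practical search would likely tolerate mismatches.

```python
# Hypothetical sketch: locate an SD-like motif upstream of a start codon.
# SD_CONSENSUS is an assumed consensus, not a sequence from this disclosure.
SD_CONSENSUS = "TAAGGAGGT"


def find_sd(seq, start_idx, consensus=SD_CONSENSUS,
            min_len=3, max_len=9, min_gap=3, max_gap=11):
    """Return (position, motif, gap) for the longest exact consensus
    fragment ending min_gap..max_gap nucleotides before start_idx,
    or None if no fragment of at least min_len nucleotides is found."""
    seq = seq.upper()
    # Try the longest fragments first so the best match wins.
    for length in range(min(max_len, len(consensus)), min_len - 1, -1):
        motifs = {consensus[i:i + length]
                  for i in range(len(consensus) - length + 1)}
        for gap in range(min_gap, max_gap + 1):
            end = start_idx - gap      # first base after the candidate motif
            begin = end - length
            if begin < 0:
                continue
            if seq[begin:end] in motifs:
                return begin, seq[begin:end], gap
    return None


# Made-up example: AGGAGG sits 6 nucleotides upstream of the ATG at index 16.
example = "GGCCAGGAGGACAGCTATGAAATAA"
print(find_sd(example, 16))
```

Searching longest fragments first is one possible design choice; it favors stronger predicted pairing with the 16S rRNA 3′ end over proximity to the initiation codon.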

Two-component regulatory proteins and proteins under the control of two-component sensing and regulatory molecules can also be secreted from the cell by creating chimeric DNA molecules that encode a protein comprising a signal peptide sequence fragment that provides for secretion of the two-component regulatory polypeptides in bacteria (U.S. Pat. No. 4,336,336). The signal sequence fragment typically encodes a signal peptide comprised of hydrophobic amino acids that direct the secretion of the protein from the cell. The protein is either secreted into the growth medium (Gram-positive bacteria) or into the periplasmic space, located between the inner and outer membrane of the cell (Gram-negative bacteria). Preferably there are processing sites, which can be cleaved either in vivo or in vitro, encoded between the signal peptide fragment and the protein of the invention.

DNA encoding suitable signal sequences can be derived from genes for secreted bacterial proteins, such as the E. coli outer membrane protein gene (ompA) (Masui et al. (1983) FEBS Lett. 151(1):159-164; Ghrayeb et al. (1984) EMBO J. 3:2437-2442) and the E. coli alkaline phosphatase signal sequence (phoA) (Oka et al. (1985) Proc. Natl. Acad. Sci. 82:7212). Other prokaryotic signals include, for example, the signal sequence from penicillinase, lpp, or heat stable enterotoxin II leaders.

Typically, transcription termination sequences recognized by bacteria are regulatory regions located 3′ to the translation stop codon and thus, together with the promoter, flank the coding sequence. These sequences direct the transcription of an mRNA that can be translated into the polypeptide encoded by the DNA. Transcription termination sequences frequently include DNA sequences (of about 50 nucleotides) that are capable of forming stem loop structures that aid in terminating transcription. Examples include transcription termination sequences derived from genes with strong promoters, such as the trp gene in E. coli as well as other biosynthetic genes.

Bacteria such as Lactobacillus acidophilus generally utilize the translation start codon ATG, which specifies the amino acid methionine (which is modified to N-formylmethionine in prokaryotic organisms). Bacteria also recognize alternative translation start codons, such as the codons GTG and TTG, which code for valine and leucine, respectively. However, when these alternative translation start codons are used as the initiation codon, these codons direct the incorporation of methionine rather than of the amino acid that they normally encode. Lactobacillus acidophilus NCFM recognizes these alternative translation start sites and incorporates methionine as the first amino acid.
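The start-codon behavior described above can be illustrated with a minimal translation sketch, in which ATG, GTG, or TTG at the initiation position yields methionine while the same codons elsewhere in the reading frame are read normally. The function name translate_orf and the example reading frames are hypothetical, and the sketch assumes the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Initiation codons named in the text; each is read as methionine at position 1.
START_CODONS = {"ATG", "GTG", "TTG"}


def translate_orf(dna):
    """Translate an open reading frame that begins at its initiation codon."""
    dna = dna.upper()
    if dna[:3] not in START_CODONS:
        raise ValueError("ORF must begin with ATG, GTG, or TTG")
    protein = ["M"]  # methionine regardless of which start codon is used
    for pos in range(3, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":  # stop codon ends translation
            break
        protein.append(residue)
    return "".join(protein)


# Made-up reading frames: an internal GTG or TTG still encodes Val or Leu.
print(translate_orf("TTGAAAGTGTAA"))  # MKV
print(translate_orf("GTGTTGTAA"))     # ML
```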

The expression vectors will have a plurality of restriction sites for insertion of the sequence of the invention so that it is under transcriptional regulation of the regulatory regions. Selectable marker genes that ensure maintenance of the vector in the cell can also be included in the expression vector. Preferred selectable markers include those which confer resistance to drugs such as ampicillin, chloramphenicol, erythromycin, kanamycin (neomycin), and tetracycline (Davies et al. (1978) Annu. Rev. Microbiol. 32:469). Selectable markers can also allow a cell to grow on minimal medium, or in the presence of toxic metabolite and can include biosynthetic genes, such as those in the histidine, tryptophan, and leucine biosynthetic pathways.

The regulatory regions can be native (homologous), or can be foreign(heterologous) to the cell and/or the nucleotide sequence of theinvention. The regulatory regions can also be natural or synthetic.Where the region is “foreign” or “heterologous” to the cell, it isintended that the region is not found in the native cell into which theregion is introduced. Where the region is “foreign” or “heterologous” tothe sequence of the invention, it is intended that the region is not thenative or naturally occurring region for the operably linkedtwo-component regulatory nucleotide sequence of the invention. Forexample, the region can be derived from phage. While the sequences couldbe expressed using heterologous regulatory regions, native regions canbe used. Such constructs would be expected in some cases to alterexpression levels of two-component regulatory proteins in the cell.Thus, the phenotype of the cell could be altered.

In preparing the expression cassette, the various DNA fragments can bemanipulated, so as to provide for the DNA sequences in the properorientation and, as appropriate, in the proper reading frame. Towardthis end, adapters or linkers can be employed to join the DNA fragmentsor other manipulations can be involved to provide for convenientrestriction sites, removal of superfluous DNA, removal of restrictionsites, or the like. For this purpose, in vitro mutagenesis, primerrepair, restriction, annealing, resubstitutions, e.g., transitions andtransversions, can be involved.

The invention further provides a vector comprising a nucleic acidmolecule of the invention cloned into the vector in an antisenseorientation. That is, the nucleic acid molecule is operably linked to aregulatory sequence in a manner that allows for expression (bytranscription of the DNA molecule) of an RNA molecule that is antisenseto two-component regulatory mRNA. Regulatory sequences operably linkedto a nucleic acid cloned in the antisense orientation can be chosen todirect the continuous or inducible expression of the antisense RNAmolecule. The antisense expression vector can be in the form of arecombinant plasmid or phagemid in which antisense nucleic acids areproduced under the control of a high efficiency regulatory region, theactivity of which can be determined by the cell type into which thevector is introduced. For a discussion of the regulation of geneexpression using antisense genes see Weintraub et al. (1986)Reviews—Trends in Genetics, Vol. 1(1).
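As a non-limiting illustration of the antisense orientation described above, the RNA transcribed from an insert cloned in reverse orientation corresponds to the reverse complement of the coding strand. The helper names below are hypothetical and the six-nucleotide input is a made-up example, not a sequence of the invention.

```python
# Hypothetical helpers illustrating antisense orientation.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_rna(coding_strand):
    """Return the RNA (5' to 3') that would base-pair with the mRNA
    transcribed from the given coding strand."""
    return reverse_complement(coding_strand).upper().replace("T", "U")


# The sense mRNA of ATGGCC is AUGGCC; its antisense RNA is GGCCAU.
print(antisense_rna("ATGGCC"))
```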

Alternatively, some of the above-described components can be put together in transformation vectors. Transformation vectors are typically comprised of a selectable marker that is either maintained in a replicon or developed into an integrating vector, as described above.

Microbial or Bacterial Cells

The production of bacteria containing heterologous genes, thepreparation of starter cultures of such bacteria, and methods offermenting substrates, particularly food substrates such as milk, can becarried out in accordance with known techniques, including but notlimited to those described in Mäyrä-Mäkinen and Bigret (1993) LacticAcid Bacteria. Salminen and vonWright eds. Marcel Dekker, Inc. New York.65-96; Sandine (1996) Dairy Starter Cultures Cogan and Accolas eds. VCHPublishers, New York. 191-206; Gilliland (1985) Bacterial StarterCultures for Food. CRC Press, Boca Raton, Fla.

By “fermenting” is intended the energy-yielding, metabolic breakdown of organic compounds by microorganisms that generally proceeds under anaerobic conditions and with the evolution of gas.

Nucleic acid molecules of the invention can be introduced into cells bymethods known in the art. By “introducing” is intended introduction intoprokaryotic cells via conventional transformation or transfectiontechniques, or by phage-mediated infection. As used herein, the terms“transformation,” “transduction,” “conjugation,” and “protoplast fusion”are intended to refer to a variety of art-recognized techniques forintroducing foreign nucleic acid (e.g., DNA) into a cell, includingcalcium phosphate or calcium chloride co-precipitation,DEAE-dextran-mediated transfection, lipofection, or electroporation.Suitable methods for transforming or transfecting cells can be found inSambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed.,Cold Spring Harbor Laboratory Press, Plainview, N.Y.) and otherlaboratory manuals.

Bacterial cells used to express the sequences of the invention are cultured in suitable media, as described generally in Sambrook et al. (1989) Molecular Cloning, A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.).

Two-Component Regulatory Response System Proteins and Related Domains

Bacteria respond to their environment through the interaction of two regulatory proteins in a two-component transduction system. One protein, generally located in the cytoplasmic membrane, is a sensor that monitors the environment, while the other is a response regulator that mediates an adaptive response, often through effecting a change in the expression of one or more genes. Two-component regulatory system proteins from different bacterial species share considerable amino acid sequence homology. The sensor protein, a histidine kinase, has an N-terminal domain (input domain) (PFAM Accession No. PF00512) that detects stimuli either directly, or through interaction with a receptor. This domain is a dimerization and phosphoacceptor domain. The cytoplasmic region (transmitter domain) of the sensor protein is highly conserved, and comprises two independently folding domains: the phosphotransfer domain and the ATP-binding kinase domain (PFAM Accession No. PF02518). The N-terminal domain may be linked through the phosphotransfer domain to the kinase domain by a HAMP domain (PFAM Accession No. PF00672). The phosphotransfer domain has a histidine residue in a region called the H box, that is involved in protein autophosphorylation and phosphatase activity. The catalytic (ATP-binding) domain contains regions of amino acid similarity, including the N, G1, F, and G2 boxes, which have been classically defined in alignments of the histidine kinase superfamily. The G1 and G2 boxes are glycine-rich sequences that resemble the nucleotide-binding motifs of other proteins, and the F box is named for a conserved phenylalanine residue. Histidine kinases fall into three subfamilies, with proteins containing the transmitter domain preceded by an amino-terminal input domain, as described above, in the classical protein subfamily. More complex histidine kinases possess a receiver domain that follows the transmitter domain. This receiver domain is similar to those from response regulators and is linked to an Hpt module (Histidine containing PhosphoTransfer). The phosphotransfer domain is remote from the kinase domain and separate from the sensor domain. These proteins are members of the unorthodox protein subfamily. The third subfamily, the hybrid proteins, are similar to the unorthodox proteins, but the Hpt module is not linked to the receiver.

Histidine kinases may also act as phosphoprotein phosphatases, increasing the dephosphorylation of their cognate response regulators in an ATP-dependent fashion. These phosphorylation/dephosphorylation reactions allow precise control over the amount of the phosphorylated form of the response regulator in the cell.

Assays to measure histidine kinase activity are well known in the art(see, for example, Stewart et al. (1998) Biochemistry 37:12269-12279;Levit et al. (1999) Biochemistry. 38:6651-6658). Methods for identifyingactive variants of histidine kinase proteins are well known in the art(see, for example, Tawa and Stewart (1994) J. Bacteriol. 176:4210-4218;Hirschman et al. (2001) Biochemistry. 40:13876-13887; Marina et al.(2001) J. Biol. Chem. 276:41182-41190). Methods to identify essentialamino acids in the HAMP domain are well known in the art (see, forexample, Appleman and Stewart (2003) J. Bacteriol. 185:89-97).

The response regulator generally has two domains, a conserved amino-terminal region termed the receiver domain (PFAM Accession No. PF00072), and a C-terminal output domain (or effector domain) (PFAM Accession No. PF00486), which is typically a transcriptional regulator (Pao and Saier (1995) J. Mol. Evol. 40:136-154). The receiver domain contains three conserved aspartyl and one conserved lysine residue that characterize the response regulator family. The conserved residues fold together to form the active site, where an aspartate residue accepts the phosphoryl group from the transmitter histidine residue, or alternatively, from a variety of small molecules (not ATP). The phosphorylation state of the receiver domain affects the activity of the output domain to elicit a response.

The transcriptional regulatory protein, C terminal domain is almost always found associated with the response regulator receiver domain. It may play a role in DNA binding (Martinez-Hackert and Stock (1997) Structure 5:109-124). Most output domains have a helix-turn-helix DNA-binding motif. Assays to measure activity of two-component regulatory systems are well known in the art (see, for example, Lee et al. (2004) Infect. Immun. 72:3968-3973; Walker and Miller (2004) J. Bacteriol. 186:4056-4066; Saini et al. (2004) Microbiology. 150:865-875; Abo-Amer et al. (2004) J. Bacteriol. 186:1879-1889). Methods to identify variants that retain activity are well known in the art (see, for example, Baruah et al. (2004) J. Bacteriol. 186:1694-1704; Wang et al. (2001) J. Bacteriol. 183:2795-2802; Piazza et al. (1999) J. Bacteriol. 181:4540-4548).

Proteins of the present invention having a response regulator receiver domain and/or a transcriptional regulatory protein, C terminal domain include those set forth in SEQ ID NOS:2, 12, 22, 26 and 28. Proteins with a histidine kinase A (phosphoacceptor) N-terminal domain of the present invention include those set forth in SEQ ID NOS:4, 14, 20, 24, 30 and 36. Proteins with a histidine kinase-, DNA gyrase B-, and HSP90-like ATPase domain of the present invention include those set forth in SEQ ID NOS:4, 14, 20, 24, 30, 34, and 36. Proteins with a HAMP domain of the present invention include those set forth in SEQ ID NOS:4, 14 and 24. Additional proteins with a response regulator domain of PFAM 00072 include SEQ ID NO:32.

The GGDEF domain (PFAM Accession No. PF00990) is found linked to a widerange of non-homologous domains in a variety of bacteria. It has beenshown to be homologous to the adenyl cyclase catalytic domain (Pei andGrishin (2001) Proteins 42:210-216) and has diguanylate cyclase activity(Paul et al. (2004) Genes Dev. 18:715-727; Galperin et al. (2001) FEMSMicrobiol. Lett. 203:11-21). This observation correlates with thefunctional information available on two GGDEF-containing proteins,namely diguanylate cyclase and phosphodiesterase A of Acetobacterxylinum, both of which regulate the turnover of cyclic diguanosinemonophosphate. Assays to measure diguanylate cyclase activity are wellknown in the art (see, for example, Paul et al. (2004) Genes Dev.18:715-727). Proteins with a GGDEF domain of the present inventioninclude those set forth in SEQ ID NO:16.

The EAL domain (PFAM Accession No. PF00563) is found in diverse bacterial signaling proteins. It is called EAL for its conserved residues. The EAL domain is a good candidate for a diguanylate phosphodiesterase function (Galperin et al. (2001) FEMS Microbiol. Lett. 203:11-21). The domain contains many conserved acidic residues that could participate in metal binding and might form the phosphodiesterase active site. It often but not always occurs along with PAS and DUF9 domains that are also found in many signaling proteins. Assays to measure phosphodiesterase activity are well known in the art (see, for example, Ausmees et al. (2001) FEMS Microbiol. Lett. 204:163-167). Proteins with an EAL domain of the present invention include those set forth in SEQ ID NO:18.

Many bacterial transcription regulatory proteins bind DNA via a helix-turn-helix (HTH) motif. These proteins are very diverse, but for convenience may be grouped into subfamilies on the basis of sequence similarity (Dehoux and Cossart (1995) Mol. Microbiol. 15:591). The deoR family (PFAM Accession No. PF00455) groups together a range of proteins, including lacR, deoR, fucR and gutR. Within this family, the HTH motif is situated towards the N-terminus (Mortensen et al. (1989) EMBO J. 8:325-331; Rosey and Stewart (1992) J. Bacteriol. 174:6159-6170; Lu and Lin (1989) Nucleic Acids Res. 17:4883-4884). One other such family, marR, groups together a range of proteins, including emrR, hpcR, hpR, marR, pecS, petP, papX, prsX, ywaE, yxaD and yybA. The Mar proteins are involved in multiple antibiotic resistance, a non-specific resistance system. The expression of the mar operon is controlled by a repressor, MarR. A large number of compounds induce transcription of the mar operon. This is thought to be due to the compound binding to MarR, with the resulting complex preventing MarR from binding to the DNA. With the MarR repression lost, transcription of the operon proceeds (Sulavik et al. (1997) J. Bacteriol. 179:1857-1866). Assays to measure transcription factor activity are well known in the art (see, for example, Sulavik et al. (1997) J. Bacteriol. 179:1857-1866). Proteins with a bacterial regulatory protein, deoR domain of the present invention include those set forth in SEQ ID NO:40. Proteins in the marR family of the present invention include those set forth in SEQ ID NO:58.

Proteins Under the Control of Two-Component Regulatory System Proteins

The Patatin-like phospholipase family (PFAM Accession No. PF01734)consists of various patatin glycoproteins from the total soluble proteinin potato tubers, with some members also found in vertebrates. Patatinis a storage protein but it also has the enzymatic activity of lipidacyl hydrolase, catalysing the cleavage of fatty acids from membranelipids (Mignery et al. (1988) Gene 62:27-44). Proteins in thepatatin-like phospholipase family of the present invention include thoseset forth in SEQ ID NO:44.

The band 7 protein (PFAM Accession No. PF01145) is an integral membraneprotein which is thought to regulate cation conductance by interactingwith other proteins of the junctional complex of the membrane skeleton.A variety of proteins belong to this family. These include theprohibitins, cytoplasmic anti-proliferative proteins and stomatin, anerythrocyte membrane protein. Bacterial HflC protein also belongs tothis family. Structurally, these proteins consist of a short N-terminaldomain which is followed by a transmembrane region and a variable size(from 170 to 350 residues) C-terminal domain. Proteins in the band 7protein family of the present invention include those set forth in SEQID NO:50.

ABC transporters form a large family of proteins responsible fortranslocation of a variety of compounds across biological membranes.They are minimally composed of four domains, with two transmembranedomains (TMDs) (PFAM Accession PF00664) responsible for allocritebinding and transport and two nucleotide-binding domains (NBDs) (PFAMAccession PF00005) responsible for coupling the energy of ATP hydrolysisto conformational changes in the TMDs. Both NBDs are capable of ATPhydrolysis, and inhibition of hydrolysis at one NBD effectivelyabrogates hydrolysis at the other. The proteins belonging to this familyalso contain one or two copies of the ‘A’ consensus sequence (Walker etal. (1982) EMBO J. 1:945-951) or the ‘P-loop’ (Saraste et al. (1990)Trends Biochem Sci. 15:430-434). Methods for measuring ATP-binding andtransport are well known in the art (see, for example, Hung et al.(1998) Nature 396:703-707; Higgins et al. (1990) J. Bioenerg. Biomembr.22:571-592). ABC transporter proteins of the present invention includethose set forth in SEQ ID NOS:60 and 82.

Characterized members of the Multi Antimicrobial Extrusion (MATE) family (PFAM Accession No. PF01554) function as drug/sodium antiporters. These proteins mediate resistance to a wide range of cationic dyes, fluoroquinolones, aminoglycosides and other structurally diverse antibiotics and drugs. MATE proteins are found in bacteria, archaea and eukaryotes. These proteins are predicted to have 12 helical transmembrane regions; some of the animal proteins may have an additional C-terminal helix. Methods for measuring antibiotic and drug resistance are well known in the art (see, for example, Mitchell et al. (1998) Antimicrob. Agents Chemother. 42:475-477; Mitchell et al. (1999) J. Biol. Chem. 274:3541-3548). Multi Antimicrobial Extrusion (MATE) family proteins of the present invention include those set forth in SEQ ID NO:72.

Lantibiotic and non-lantibiotic bacteriocins are synthesized asprecursor peptides containing N-terminal extensions (leader peptides),which are cleaved off during maturation. Most non-lantibiotics and alsosome lantibiotics have leader peptides of the so-called double-glycinetype. These leader peptides share consensus sequences and also a commonprocessing site with two conserved glycine residues in positions −1 and−2. The double-glycine-type leader peptides are unrelated to theN-terminal signal sequences, which direct proteins across thecytoplasmic membrane via the sec pathway. Various methods can be used toassay for bacteriocin activity including, for example, the experimentalsection herein, Ogunbanwo et al. (2003) Afr. J. Biotechnology 2:219-227, Allison et al. (1994) J. Bacteriol. 176:2235-2241 and VanLoveren et al. (2000) Caries Research 34:481-485. Examples of amino acidsequences of the present invention that have double-glycine-type leaderpeptides include those set forth in SEQ ID NOS:74, 76, 84, 86, 90, 92,96 and 114.
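The double-glycine processing site described above, with conserved glycine residues at positions −1 and −2 relative to the cleavage point, can be illustrated by a simple heuristic that splits a precursor peptide after the first Gly-Gly pair found near the N-terminus. This is a sketch only: the function name, the 30-residue search window, and the example precursor are assumptions made for illustration, and actual leader peptides share additional consensus residues that this heuristic ignores.

```python
def split_double_glycine(precursor, max_leader=30):
    """Split a precursor after the first GG pair lying within the first
    max_leader residues. Returns (leader, mature_peptide), or None when
    no GG pair is found in that window."""
    precursor = precursor.upper()
    site = precursor.find("GG", 0, max_leader)
    if site == -1:
        return None
    cut = site + 2  # cleavage follows the glycines at positions -2 and -1
    return precursor[:cut], precursor[cut:]


# Made-up precursor: an 18-residue leader ending in GG, then the mature peptide.
print(split_double_glycine("MKQFNELSLEELSKIVGGKRGSWAC"))
```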

The processing sites of these peptides are different from typical signalpeptidase cleavage sites, suggesting that a different processing enzymeis involved. Peptide bacteriocins are exported across the cytoplasmicmembrane by a dedicated ATP-binding cassette (ABC) transporter. The ABCtransporter is the maturation protease and its proteolytic domainresides in the N-terminal part of the protein (Havarstein et al. (1995)Mol. Microbiol. 16:229-240). This peptidase domain is found in a widerange of ABC transporters, however the presumed catalytic cysteine andhistidine are not conserved in all members of this family. Peptidasesare grouped into clans and families. Clans are groups of families forwhich there is evidence of common ancestry. Families are grouped bytheir catalytic type, the first character representing the catalytictype: S, serine; T, threonine; C, cysteine; A, aspartic; M, metallo andU, unknown. A clan that contains families of more than one type isdescribed as being of type P. The serine, threonine and cysteinepeptidases utilise the catalytic part of an amino acid as a nucleophileand form an acyl intermediate—these peptidases can also readily act astransferases. In the case of aspartic and metallopeptidases, thenucleophile is an activated water molecule.

Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. The peptidase domain is responsible for peptide bond hydrolysis; in Merops this is termed the peptidase unit. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad (Barrett and Rawlings (2001) Biol. Chem. 382:727-733). The peptidase C39 family (clan CA) (PFAM Accession No. PF03412) is found in a wide range of ABC transporters, which are maturation proteases for peptide bacteriocins, the proteolytic domain residing in the N-terminal region of the protein (Rawlings and Barrett (1995) Methods Enzymol. 248:183-228). Assays for measuring peptidase activity are well known in the art (see, for example, Havarstein et al. (1995) Mol. Microbiol. 16:229-240). Proteins of the present invention in the peptidase C39 family include those set forth in SEQ ID NO:82.

RelE and RelB form a toxin-antitoxin system. RelE represses translation, probably through binding ribosomes (Pedersen et al. (2002) Mol. Microbiol. 45:501-510 and Terry et al. (2001) J. Bacteriol. 183:2700-2703). A polypeptide having a RelE and RelB domain is set forth in SEQ ID NO:52.

Viruses, parasites and bacteria are covered in protein and sugarmolecules that help them gain entry into a host by counteracting thehost's defences. One such molecule is the M protein produced by certainstreptococcal bacteria. M proteins embody a motif that is now known tobe shared by many Gram-positive bacterial surface proteins. The motifincludes a conserved hexapeptide, which precedes a hydrophobicC-terminal membrane anchor, which itself precedes a cluster of basicresidues. It has been proposed that this hexapeptide sequence isresponsible for a post-translational modification necessary for theproper anchoring of the proteins which bear it, to the cell wall. Apolypeptide having such a domain is found in SEQ ID NO:78.

The LytTr domain is found in a variety of bacterial transcriptional regulators. The domain binds to a specific DNA sequence pattern (see Nikolskaya et al. (2002) Nucleic Acids Research 30:2453-2459). The LytTr domain is a DNA-binding, potential winged helix-turn-helix domain (˜100 residues) present in a variety of bacterial transcriptional regulators of the algR/agrA/lytR family. It is named after the lytR response regulators involved in the regulation of cell autolysis. The LytTr domain binds to a specific DNA sequence pattern in the upstream regions of target genes. The N-terminus of the protein contains a response regulator receiver domain. The consensus sequence for this domain is in PFAM04397. A polypeptide having this domain is set forth in SEQ ID NO:32.

Members of the CAAX amino terminal protease family are probably proteases. The family contains CAAX prenyl protease. The proteins contain a highly conserved Glu-Glu motif at the amino end of the alignment. The alignment also contains two histidine residues that may be involved in zinc binding. This family consists of various hypothetical protein sequences for which the function is unknown. One of the proteins is an abortive infection protein that confers resistance to the bacteriophage Phi 712. AbiG is an abortive infection (Abi) mechanism encoded by the conjugative plasmid pCI750 originally isolated from Lactococcus lactis subsp. cremoris UC653. The resistance mechanism acts at neither the phage adsorption nor the phage DNA restriction level. Also in this family is a series of bacteriocin-like peptides PlnP, PlnI, PlnT, PlnP and PlnU from Lactobacillus plantarum C11. Lactobacillus plantarum C11 secretes a small cationic peptide, plantaricin A, that serves as an induction signal for bacteriocin production as well as transcription of plnABCD. The plnABCD operon encodes the plantaricin A precursor (PlnA) itself and determinants (PlnBCD) for a signal transducing pathway. The consensus sequence for this domain is in PFAM12517. Polypeptides having this domain are set forth in SEQ ID NOS:98 and 102.

Glycosyl hydrolases are key enzymes of carbohydrate metabolism. Family 31 comprises enzymes that are, or are similar to, alpha-galactosidases. O-Glycosyl hydrolases (EC 3.2.1.-) are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:PUB00007032.

Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in ‘clans’. Glycoside hydrolase family 31 comprises enzymes with several known activities: α-glucosidase (EC:3.2.1.20); α-galactosidase (EC:3.2.1.22); glucoamylase (EC:3.2.1.3); sucrase-isomaltase (EC:3.2.1.48) (EC:3.2.1.10); α-xylosidase (EC:3.2.1); and α-glucan lyase (EC:4.2.2.13). Glycoside hydrolase family 31 groups a number of glycosyl hydrolases on the basis of sequence similarities PUBMED:1747104, PUBMED:1761061, PUBMED:1743281. An aspartic acid has been implicated PUBMED:1856189 in the catalytic activity of sucrase, isomaltase, and lysosomal α-glucosidase. The consensus sequence for this domain is in PFAM01055. A polypeptide having this domain is set forth in SEQ ID NO:116.

The Mur ligase family, glutamate ligase domain contains a number of related ligase enzymes which have EC numbers 6.3.2. This family includes: MurC, MurD, MurE, MurF, Mpl and FolC. MurC, MurD, MurE and MurF catalyse consecutive steps in the synthesis of peptidoglycan. Peptidoglycan consists of a sheet of two sugar derivatives, with one of these, N-acetylmuramic acid, attaching to a small pentapeptide. The pentapeptide is made of L-alanine, D-glutamic acid, meso-diaminopimelic acid and D-alanyl alanine. The peptide moiety is synthesised by successively adding these amino acids to UDP-N-acetylmuramic acid. MurC transfers the L-alanine, MurD transfers the D-glutamate, MurE transfers the diaminopimelic acid, and MurF transfers the D-alanyl alanine. This family also includes folylpolyglutamate synthase, which transfers glutamate to folylpolyglutamate, and cyanophycin synthetase, which catalyses the biosynthesis of the cyanobacterial reserve material multi-L-arginyl-poly-L-aspartate (cyanophycin). The C-terminal domain is almost always associated with the cytoplasmic peptidoglycan synthetases, N-terminal domain. The consensus sequence for this domain is in PFAM02875. A polypeptide having this domain is set forth in SEQ ID NO:118.

ATP-binding cassette (ABC) transporters are multidomain membrane proteins, responsible for the controlled efflux and influx of substances (allocrites) across cellular membranes. They are minimally composed of four domains, with two transmembrane domains (TMDs) responsible for allocrite binding and transport and two nucleotide-binding domains (NBDs) responsible for coupling the energy of ATP hydrolysis to conformational changes in the TMDs. Both NBDs are capable of ATP hydrolysis, and inhibition of hydrolysis at one NBD effectively abrogates hydrolysis at the other. Hydrolysis at the two NBDs may occur in an alternative fashion although they appear substantially functionally symmetrical in terms of their binding to diverse nucleotides. A number of bacterial transport systems have been found to contain integral membrane components that have similar sequences: these systems fit the characteristics of ATP-binding cassette transporters. The proteins form homo- or hetero-oligomeric channels, allowing ATP-mediated transport. Hydropathy analysis of the proteins has revealed the presence of 6 possible transmembrane regions. These proteins belong to family 2 of ABC transporters. The consensus sequence for this domain is in PFAM01061. A polypeptide having this domain is set forth in SEQ ID NO:120 and 122.

ATP-binding cassette (ABC) transporters are multidomain membrane proteins, responsible for the controlled efflux and influx of substances (allocrites) across cellular membranes. They are minimally composed of four domains, with two transmembrane domains (TMDs) responsible for allocrite binding and transport and two nucleotide-binding domains (NBDs) responsible for coupling the energy of ATP hydrolysis to conformational changes in the TMDs. Both NBDs are capable of ATP hydrolysis, and inhibition of hydrolysis at one NBD effectively abrogates hydrolysis at the other. Hydrolysis at the two NBDs may occur in an alternative fashion although they appear substantially functionally symmetrical in terms of their binding to diverse nucleotides. A variety of ATP-binding transport proteins have a six transmembrane helical region. They are all integral membrane proteins involved in a variety of transport systems. Members of this family include: the cystic fibrosis transmembrane conductance regulator (CFTR), bacterial leukotoxin secretion ATP-binding protein, multidrug resistance proteins, the yeast leptomycin B resistance protein, the mammalian sulphonylurea receptor and antigen peptide transporter 2. Many of these proteins have two such regions. The consensus sequence for this domain is in PFAM00664. A polypeptide having this domain is set forth in SEQ ID NO:120 and 122.

The GTPase of unknown function family is a member of the G-protein superfamily clan. This clan includes the following Pfam members: NOG1; MMR_HSR1; IIGP; GTP_EFTU; GTP_CDC; Dynamin_N; DUF258; Arf; and AIG1. Human HSR1 has been localized to the human MHC class I region and is highly homologous to a putative GTP-binding protein, MMR1, from mouse. These proteins represent a new subfamily of GTP-binding proteins that has both prokaryote and eukaryote members. The consensus sequence for this domain is in PFAM01926. A polypeptide having this domain is set forth in SEQ ID NO:154.

Proteins containing the ParB-like nuclease domain appear to be related to the Escherichia coli plasmid protein ParB, which preferentially cleaves single-stranded DNA. ParB also nicks supercoiled plasmid DNA, preferably at sites with potential single-stranded character, like AT-rich regions and sequences that can form cruciform structures. ParB also exhibits 5′-3′ exonuclease activity. The consensus sequence for this domain is in PFAM02195. A polypeptide having this domain is set forth in SEQ ID NO:158 and 162.

The CobQ/CobB/MinD/ParA nucleotide binding domain family consists of various cobyrinic acid a,c-diamide synthases. These include CbiA and CbiP from Salmonella typhimurium (Pollich et al. (1995) J. Bacteriol. 177:1487-4487), and CobQ from Rhodobacter capsulatus (Roth et al. (1993) J. Bacteriol. 175:3303-3316). These amidases catalyse amidations to various side chains of hydrogenobyrinic acid or cobyrinic acid a,c-diamide in the biosynthesis of cobalamin (vitamin B12) from uroporphyrinogen III. Vitamin B12 is an important cofactor and an essential nutrient for many plants and animals and is primarily produced by bacteria (Pollich et al. (1995) J. Bacteriol. 177:1487-4487). The family also contains dethiobiotin synthetases as well as the plasmid partitioning proteins of the MinD/ParA family (Raux et al. (1998) Biochem. J. 335:159-166). The consensus sequence for this domain is in PFAM01656. A polypeptide having this domain is set forth in SEQ ID NO:160.

Glucose inhibited division protein is a family of bacterial glucose inhibited division proteins that are probably involved in the regulation of cell division. This family is a member of the Methyltransferase superfamily clan. This clan includes the following Pfam members: CheR; CMAS; Cons_hypoth95; DNA_methylase; DOT1; Eco57I; Fibrillarin; FtsJ; GidB; MethyltransfD12; Methyltransf_10; Methyltransf_2; Methyltransf_3; Methyltransf_4; Methyltransf_5; Methyltransf_8; Methyltransf_9; Met_10; Mg-por_mtran_C; MT-A70; MTS; N6_Mtase; N6_N4_Mtase; NNMT_PNMT_TEMT; NodS; Nol1_Nop2_Fmu; PARP_regulatory; PCMT; PrmA; RrnaAD; rRNA_methylase; Spermine_synth; TehB; TPMT; TRM; tRNA_U5-meth_tr; Ubie_methyltran; UPF0020. GidB (glucose-inhibited division protein B) appears to be present, in a single copy, in all complete eubacterial genomes so far. Its mode of action is unknown, but a methyltransferase fold is reported from the crystal structure. A polypeptide having this domain is set forth in SEQ ID NO:164.

Methods of Use

Many two-component response systems are known in bacteria, including, but not limited to, the Arc two-component signal transduction system of E. coli, which regulates numerous operons in response to respiratory growth conditions (see, for example, Kwon et al. (2000) J. Bacteriol. 182:2960-2966); PhoQ/PhoP, which responds to changes in environmental levels of Mg²⁺ (see, for example, Marina et al. (2001) J. Biol. Chem. 276:41182-41190); PmrAB, which modulates resistance to cationic antimicrobial peptides (see, for example, Moskowitz et al. (2004) J. Bact. 186:575-579); EnvZ/OmpR, which responds to changes in osmotic conditions (see, for example, Cai and Inouye (2002) J. Biol. Chem. 277:24155-24161); NarX/NarL, which responds to nitrite levels (see, for example, Stewart (1994) Antonie Van Leeuwenhoek 66:37-45); PhoR/PhoB, which responds to low phosphate concentrations in the environment and periplasmic space (see, for example, Pragai et al. (2004) J. Bacteriol. 186:1182-1190); covRS, which regulates expression of fructosyltransferase (see, for example, Lee et al. (2004) Infect. Immun. 72:3968-3973); and RegB/RegA, which is a highly conserved redox-responding global two-component regulatory system from Rhodobacter capsulatus and Rhodobacter sphaeroides (see, for example, Elsen et al. (2004) Microbiol. Mol. Biol. Rev. 68:263-279).

The two-component regulatory system proteins of the present invention are useful in regulating the response of an organism to various environmental conditions. Methods are provided wherein properties of microbes used in fermentation are modified to provide bacterial strains able to survive stressful conditions, such as acid or alkaline stress, osmotic or oxidative stress, starvation, or the presence of other microorganisms (see, for example, Wick and Egli (2004) Adv. Biochem. Eng. Biotechnol. 89:1-45). This ability to survive stressful environmental conditions will increase the utility of these microorganisms in fermenting various foods, as well as allowing them to provide longer-lasting probiotic activity after ingestion. One way this may occur is by enhancing the ability of an organism to survive passage through the gastrointestinal tract. In general, the methods comprise overexpressing one or more proteins controlled by two-component sensing and regulatory systems. In one embodiment, the protein is a bacteriocin. By “overexpressing” is meant that the protein of interest is produced in an increased amount in the modified bacterium compared to its production in a wild-type bacterium.

The proteins and nucleic acid sequences encoding them may increase the ability of a microorganism to survive in the presence of an antimicrobial (see, for example, Moskowitz et al. (2004) J. Bact. 186:575-579). They may also enable a microorganism to form a biofilm (see, for example, Danhorn et al. (2004) J. Bacteriol. 186:4492-4501).

The proteins and nucleic acid sequences encoding them may enable an organism to respond to an environmental stimulus, including, but not limited to, turgor pressure, a chemical stimulus, heavy-metal cations, oxygen, iron, an antimicrobial, and glucose.

TABLE 1
Two-Component Sensing and Regulatory Proteins of the Present Invention

ORF# | SEQ ID NO: | GENE | FUNCTION
78 | 1, 2 | VicR response regulator | DNA binding/transcription regulation
79 | 3, 4 | VicK histidine kinase | Two-component sensing/signal transduction/ATP binding
248 | 5, 6 | Two-component response regulator | DNA binding/transcription regulation
602 | 7, 8 | Histidine kinase | Two-component sensing/signal transduction/ATP binding
603 | 9, 10 | Response regulator | DNA binding/transcription regulation
746 | 11, 12 | Response regulator | DNA binding/transcription regulation
747 | 13, 14 | Histidine kinase | Two-component sensing/signal transduction/ATP binding
1413 | 15, 16 | Sensory transduction system regulatory components (Histidine kinase?) | Two-component sensing/signal transduction/ATP binding
1414 | 17, 18 | Response regulator | DNA binding/transcription regulation
1430 | 19, 20 | Histidine kinase | Two-component sensing/signal transduction/ATP binding
1431 | 21, 22 | Response regulator | DNA binding/transcription regulation
1524 | 23, 24 | LisK histidine kinase | Two-component sensing/signal transduction/ATP binding
1525 | 25, 26 | LisR response regulator | DNA binding/transcription regulation
1659 | 27, 28 | Response regulator | DNA binding/transcription regulation
1660 | 29, 30 | Sensory protein kinase | Two-component sensing/signal transduction/ATP binding
1798 | 31, 32 | Response regulator | DNA binding/transcription regulation
1799 | 33, 34 | Sensory histidine kinase | Two-component sensing/signal transduction/ATP binding
1819 | 35, 36 | Sensory histidine kinase | Two-component sensing/signal transduction/ATP binding
1820 | 37, 38 | Response regulator | DNA binding/transcription regulation
599 | 39, 40 | Transcriptional regulator DeoR | Transcription regulation
600 | 41, 42 | Phosphoketolase |
601 | 43, 44 | Patatin-like phospholipase/protease | Nutrient reservoir activity
604 | 45, 46 | PlnI | Bacteriocin immunity
1563 | 47, 48 | Flavodoxin |
1564 | 49, 50 | Membrane protein | Cation conductance regulation
1565 | 51, 52 | DNA-damage-inducible protein J |
1566 | 53, 54 | Helveticin | Antimicrobial
595 | 55, 56 | Hydrolase |
596 | 57, 58 | MarR transcriptional regulator | Transcription regulation
597 | 59, 60 | Multidrug resistance ABC transporter | ATP binding/transport
598 | 61, 62 | Immunity protein | Antimicrobial immunity
1567 | 63, 64 | Aminopeptidase |
1568 | 65, 66 | Surface protein |
1569 | 67, 68 | Transposase |
1570 | 69, 70 | Transposase |
1571 | 71, 72 | MatE membrane protein | Antiporter/multidrug transport
1791 | 73, 74 | Bacteriocin | Antimicrobial
1792 | 75, 76 | Bacteriocin | Antimicrobial
1793 | 77, 78 | Hypothetical protein |
1794 | 79, 80 | ORF2, gassericin accessory protein |
1796 | 81, 82 | PlnG | Peptidase/ATP binding/transport
1797 | 83, 84 | Acidocin J1132 two-component bacteriocin | Antimicrobial
1800 | 85, 86 | Bacteriocin | Antimicrobial
1801 | 87, 88 | Unknown protein |
1802 | 89, 90 | Bacteriocin | Antimicrobial
1803 | 91, 92 | Bacteriocin | Antimicrobial
1804 | 93, 94 | Hypothetical protein |
1805 | 95, 96 | Bacteriocin | Antimicrobial
1808 | 97, 98 | Immunity protein | Antimicrobial immunity
1809 | 99, 100 | Hypothetical protein |
1810 | 101, 102 | Immunity protein | Antimicrobial immunity
1811 | 103, 104 | Hypothetical protein |
1812 | 105, 106 | Alpha-glucosidase II |
1813 | 107, 108 | Hypothetical protein |
1814 | 109, 110 | Unknown protein |
1815 | 111, 112 | Hypothetical protein |
1816 | 113, 114 | Bacteriocin | Antimicrobial
1817 | 115, 116 | Aspartate racemase |
1818 | 117, 118 | UDP-N-acetylmuramyl TP synthase |
1821 | 119, 120 | Transporter |
1822 | 121, 122 | Transporter |
80 | 123, 124 | YycH protein |
81 | 125, 126 | YycI protein |
82 | 127, 128 | Hypothetical protein |
83 | 129, 130 | HtrA serine protease |
1421 | 131, 132 | Oxidoreductase |
1422 | 133, 134 | Pyrazinamidase/nicotinamidase |
1423 | 135, 136 | Unknown protein |
1424 | 137, 138 | Amino acid permease |
1425 | 139, 140 | Hypothetical protein |
1426 | 141, 142 | Unknown protein |
1427 | 143, 144 | Oxidoreductase |
1428 | 145, 146 | Hypothetical protein |
1429 | 147, 148 | Transporter |
1432 | 149, 150 | Hypothetical protein |
1823 | 151, 152 | Uncharacterized membrane protein |
1824 | 153, 154 | GTPase |
1825 | 155, 156 | Unknown protein |
1826 | 157, 158 | Predicted transcription regulator |
1827 | 159, 160 | ParA ATPase |
1828 | 161, 162 | Predicted transcription regulator |
1829 | 163, 164 | Predicted S-adenosylmethionine transferase |

The following examples are offered by way of illustration and not by way of limitation.

Experimental

Example 1 The Lactobacillus acidophilus NCFM Genome

The complete genome of Lactobacillus acidophilus NCFM consists of 1,993,570 nucleotides with an average GC content of 34.71%. In silico analyses revealed the presence of 1864 open reading frames (ORFs) resulting in a coding percentage of 87.9%. One or more protein families (PFam) were attributed to 75% of these ORFs and 89% showed similarities to at least one COG (cluster of orthologous groups of proteins). As a result of the manual annotation curation, only 11.7% of the ORFs remained unknown and 15.8% showed similarities to unclassified genes of other organisms. Of the predicted ORFs, 72.5% were assigned to a defined function. Sequences from the genome of Lactobacillus acidophilus NCFM have been described in U.S. Provisional Patent Application No. 60/465,621 filed on Apr. 23, 2003, U.S. Provisional Patent Application No. 60/480,764 filed on Jun. 23, 2003, U.S. Provisional Patent Application No. 60/546,745 filed on Feb. 23, 2004, U.S. Provisional Patent Application No. 60/551,121 filed on Mar. 8, 2004, U.S. Provisional Patent Application No. 60/551,161 filed on Mar. 8, 2004 and U.S. Provisional Patent Application No. 60/662,712 filed on Oct. 27, 2004, and U.S. patent application Ser. No. 10/831,070 filed on Apr. 23, 2004, U.S. patent application Ser. No. 10/873,467 filed on Jun. 22, 2004, U.S. patent application Ser. No. 11/074,176 filed on Mar. 7, 2005 and U.S. patent application Ser. No. 11/074,226 filed on Mar. 7, 2005, the disclosures of which are incorporated herein by reference in their entireties.

The Origin of Replication was predicted by GC-skew analysis and the ORF orientation shift. Directly adjacent to this locus, a gene showing significant similarities to dnaA was identified. Further analyses revealed the presence of a highly conserved gene arrangement (rnpA, ORF La1978; rpmH, ORF La1979; dnaA, ORF La1; dnaN, ORF La2; recF, ORF La4; and gyrB, ORF La5) which can be found in a wide range of other prokaryotes, including Bacillus subtilis, Escherichia coli, and Synechococcus (Liu and Tsinoremas (1996) Gene 172:105-109; Ogasawara et al. (1985) EMBO J. 4:3345-3350). In order to initiate the chromosome replication, DnaA requires the presence of several DnaA-boxes (Fujikawa et al. (2003) Nucleic Acids Res. 31:2077-2086). Seven DnaA-boxes with a length of 8 nucleotides were determined directly upstream of dnaA, whereas only one was identified downstream of dnaA. Accordingly, this region was designated oriC and most likely represents the DNA replication initiation locus. Subsequently, the genome sequence was rotated and starts 30 nucleotides upstream of dnaA. The Terminus of DNA replication was identified similarly by GC-skew and ORF orientation shift analysis. The exact position could not be determined, since no replication terminator protein could be identified (Griffiths et al. (1998) J. Bacteriol. 180:3360-3367). However, a chromosome segregation helicase (ORF La1077) and DnaD (ORF La1161) were identified at the proposed Terminus locus. In addition, a genome region of ~300 kilobase pairs with the predicted Terminus in its center showed a significantly lower average GC content. This lower GC content could aid in the separation of the chromosomal strands. The Origin and the Terminus of DNA replication are placed fairly symmetrically in the genome.
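By way of illustration only, the GC-skew analysis referred to above can be sketched as follows. This is a minimal sketch and not the procedure actually used for the NCFM genome: the function names, the window size, and the convention that the cumulative skew reaches its minimum near the origin and its maximum near the terminus are assumptions made for the example.

```python
def cumulative_gc_skew(seq, window=1000):
    """Return the cumulative GC skew, one value per non-overlapping window.

    The skew of a window is (G - C) / (G + C). Summing it along the
    chromosome gives a curve whose extremes are commonly used to
    estimate the replication origin and terminus.
    """
    seq = seq.upper()
    totals = []
    running = 0.0
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        if g + c:  # skip windows without any G or C
            running += (g - c) / (g + c)
        totals.append(running)
    return totals


def predict_origin_and_terminus(seq, window=1000):
    """Estimate origin and terminus coordinates (hypothetical helper).

    Assumes the leading strand is G-rich, so the cumulative skew is
    lowest at the origin and highest at the terminus.
    """
    skew = cumulative_gc_skew(seq, window)
    origin_idx = min(range(len(skew)), key=skew.__getitem__)
    terminus_idx = max(range(len(skew)), key=skew.__getitem__)
    # Each cumulative value belongs to the end of its window.
    origin = min((origin_idx + 1) * window, len(seq))
    terminus = min((terminus_idx + 1) * window, len(seq))
    return origin, terminus


if __name__ == "__main__":
    # Toy chromosome: a C-rich half followed by a G-rich half.
    toy = "ACCT" * 1250 + "AGGT" * 1250
    print(predict_origin_and_terminus(toy))
```

In practice such a prediction would be combined with other evidence, as described above for the ORF orientation shift and the DnaA-boxes.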

Sixty-one tRNAs were identified within the genome. Only 8 tRNAs were located on the lagging strand, mostly clustered around an rRNA locus. tRNAs for all 21 amino acids were found with redundant tRNAs for all amino acids except cysteine and tryptophan. Ribosomal proteins were mainly assembled around one locus at 260 kilobase pairs. Four ribosomal RNA loci were identified throughout the genome. Three of them were clustered within the first 500 kilobase pairs and oriented in the same sense-direction, whereas the fourth rRNA locus, located at ~1.6 megabases, is oriented in the opposite direction. Thus, all rRNA loci were in phase with the direction of DNA replication.

The COG database classifies paralogous proteins of at least three lineages into functionally related groups. Three major sections are currently described and a fourth section includes proteins with poorly characterized functions. The graphical representation of the COG distribution shows that the majority of predicted proteins (64.4%) could be classified into the three functional classes and only 19% were assigned to the “poorly characterized” group. However, 6.6% of COGs could not be assigned into any classification, designated here as COG category 5. Of those, five genome regions stand out, due to their visual dominance (COG-I to COG-V). Functional annotation revealed that all of the genes present in these COG category 5 regions I to V were predicted to be involved in cell-adherence and initial host-cell recognition (i.e., ORF La1016-ORF La1020, ORF La1377, ORF La1392: mucus binding proteins; ORF La1606-ORF La1612: fibronectin binding proteins; and ORF La1633-ORF La1636: surface bound proteins). Further analyses of other organisms might lead to a separate COG group within the extracellular structures (functional category W) to reflect this set of proteins and their common function.

Analysis of the GC-content distribution showed localized peak deviations from the average GC content of the genome. Without exceptions, GC-content spikes were found to harbor the four rRNA loci (average GC content of 50.88%), whereas the two neighboring low GC-regions at 1.75 megabases (average GC content of 28.5%) revealed the presence of a large uncharacterized region unique to Lactobacillus acidophilus NCFM and an EPS cluster. The EPS cluster consisted of fourteen genes including the highly conserved proteins EpsA-EpsF (ORF La1732-ORF La1737), EpsJ (ORF La1725 and ORF La1726), and EpsI (ORF La1724) and five variable proteins (ORF La1727-ORF La1731) representing glycosyl transferases and polysaccharide polymerases. Together, this set shows high synteny to reported exopolysaccharide (EPS) clusters in streptococci (Stingele et al. (1996) J. Bacteriol. 178:1680-1690) and recently reported in L. gasseri and L. johnsonii (Pridmore et al. (2004) Proc. Natl. Acad. Sci. U.S.A. 101:2512-2517). Scanning electron microscopy of NCFM did not detect an external polysaccharide layer (Hood and Zottola (1987) J. Food Sci. 52:791), and it remains unclear whether the EPS cluster is functional or if any EPS produced is excreted rather than anchored. Three ORFs in the NCFM EPS cluster encode two UDP-galactopyranose mutases and a membrane protein involved with the export of O-antigen and teichoic acid. Other teichoic acid associated ORFs include a tandem set of teichoic acid biosynthesis and transport proteins (ORF La524 and ORF La525), another predicted biosynthetic protein (ORF La519), two more polysaccharide transporters specific to O-antigen and teichoic acid (ORF La1614 and ORF La1917), along with a cell wall teichoic acid glycosylation protein (ORF La621). An exaggerated inflammatory response from intestinal epithelial cells to gram-negative bacteria can be tempered by teichoic acids from lactobacilli (Vidal et al. (2002) Infect. Immun. 70:2057-2064), suggesting an intimate involvement of teichoic acids and the immune system. The uncharacterized low GC regions and the EPS cluster are centered on two divergently oriented transposases (ORF La1722, ORF La1721, and ORF La1720). The exceptionally low GC content and the presence of mobile elements could indicate the acquisition of this region via horizontal gene transfer.
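The kind of GC-content distribution analysis described above can be approximated with a sliding-window scan. The following is a hedged sketch only: the window size, step, deviation threshold, and function names are illustrative assumptions and are not taken from the analysis actually performed on the NCFM genome.

```python
def gc_content(seq):
    """Return the GC content of a sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def gc_deviations(seq, window=5000, step=1000, threshold=10.0):
    """Report windows whose GC content deviates from the genome average.

    Returns the genome-wide GC percentage and a list of
    (start, end, local_gc) tuples for every window whose GC content
    differs from the average by at least `threshold` percentage points.
    All parameter defaults are assumptions chosen for illustration.
    """
    genome_gc = gc_content(seq)
    hits = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        end = min(start + window, len(seq))
        local = gc_content(seq[start:end])
        if abs(local - genome_gc) >= threshold:
            hits.append((start, end, round(local, 2)))
    return genome_gc, hits


if __name__ == "__main__":
    # Toy sequence: an AT-only background with one GC-only island.
    toy = "AT" * 500 + "GC" * 100 + "AT" * 500
    average, islands = gc_deviations(toy, window=200, step=200, threshold=10.0)
    print(round(average, 2), islands)
```

Windows flagged this way are only candidates; whether a region was acquired by horizontal gene transfer depends on further evidence such as the mobile elements noted above.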

The NCFM genomic DNA sequence was analyzed for repetitive DNA by a “repeat and match analysis.” One intergenic region between ORF La1550 (DNA polymerase I, polA) and ORF La1551 (putative phosphoribosylamine-glycine ligase, purD) had features characteristic of a SPIDR (SPacers Interspersed Direct Repeats) locus. This region was approximately 2.4 kilobases long and contained 32 nearly perfect repeats of 29 base pairs separated by unique 32 base pair spacers. The SPIDR locus constitutes a novel family of repeat sequences that are present in Bacteria and Archaea but not in Eukarya (Jansen et al. (2002) OMICS 6:23-33). The repeat loci typically consist of repetitive stretches of nucleotides with a length of 25 to 37 base pairs alternated by nonrepetitive DNA spacers of approximately equal size as the repeats. To date, SPIDR loci have been identified in more than forty microorganisms (Jansen et al. (2002) OMICS 6:23-33), but from the lactic acid bacteria, have only been described from Streptococcus spp. Despite their discovery over 15 years ago in E. coli (Ishino et al. (1987) J. Bacteriol. 169:5429-5433), no physiological function has yet been elucidated.
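As an illustration of how a locus of direct repeats separated by spacers might be located, the sketch below scans a sequence for a repeat unit that recurs at a fixed period. It is not the “repeat and match analysis” used above: it is a simplified stand-in that requires exact repeats and a fixed spacer length, whereas the repeats described here are only nearly perfect. The function name and parameters are hypothetical.

```python
def find_spaced_repeats(seq, repeat_len=29, spacer_len=32, min_copies=3):
    """Find runs of exact direct repeats separated by fixed-length spacers.

    Returns a list of (start, end, copies) tuples, where `start` is the
    first base of the first repeat and `end` is one past the last base of
    the last repeat (0-based, end-exclusive). Simplifications, all of
    them assumptions of this sketch: repeats must match exactly, spacers
    must all have length `spacer_len`, and spacer uniqueness is not
    checked.
    """
    seq = seq.upper()
    period = repeat_len + spacer_len
    loci = []
    i = 0
    while i + repeat_len <= len(seq):
        unit = seq[i:i + repeat_len]
        copies = 1
        j = i + period
        # Walk forward one period at a time while the unit recurs.
        while seq[j:j + repeat_len] == unit:
            copies += 1
            j += period
        if copies >= min_copies:
            end = j - spacer_len  # end of the last matching repeat
            loci.append((i, end, copies))
            i = end
        else:
            i += 1
    return loci


if __name__ == "__main__":
    # Toy locus: repeat "GGCC" three times with 3-base spacers.
    toy = "TTT" + "GGCC" + "ATA" + "GGCC" + "TAT" + "GGCC" + "AAA"
    print(find_spaced_repeats(toy, repeat_len=4, spacer_len=3, min_copies=3))
```

A practical scan would tolerate mismatches within the repeats and variation in spacer length, and would filter out low-complexity sequence.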

Example 2 Gapped BlastP Results for Amino Acid Sequences

A Gapped BlastP sequence alignment showed that SEQ ID NO:2 (238 amino acids) has about 83% identity from amino acids 1-237 with a protein from Lactobacillus johnsonii that is a two-component regulatory system response regulator (Accession No. NP_964081), about 83% identity from amino acids 1-237 with a protein from Lactobacillus gasseri that is a response regulator consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain (Accession No. ZP_00046798), about 71% identity from amino acids 1-237 with a protein from Lactobacillus sakei that is a putative response regulator (Accession No. AAD10263), about 72% identity from amino acids 3-237 with a protein from Enterococcus faecalis that is a DNA-binding response regulator VicR (Accession No. NP_814922), and about 72% identity from amino acids 3-237 with a protein from Enterococcus faecalis that is a response regulator VicR (Accession No. CAB64972).
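For orientation, the percent identity and amino acid range reported for each alignment in this Example can be understood through the following minimal sketch. It assumes the usual convention that identity is the number of identical aligned residues divided by the alignment length, gap columns included; the specification does not state the calculation, and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_query, aligned_subject):
    """Percent identity over a pairwise alignment ('-' marks a gap).

    Identity is counted as identical residue pairs divided by the total
    number of alignment columns, gap columns included (an assumed
    convention for this sketch).
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    columns = len(aligned_query)
    if columns == 0:
        return 0.0
    identical = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != "-"
    )
    return 100.0 * identical / columns


def query_range(aligned_query, query_start=1):
    """Return the 1-based (first, last) query residues covered."""
    residues = sum(1 for q in aligned_query if q != "-")
    return query_start, query_start + residues - 1


if __name__ == "__main__":
    # Toy alignment: six columns, one gap in the query.
    query = "MKT-LV"
    subject = "MKSALV"
    print(round(percent_identity(query, subject), 1), query_range(query))
```

A statement such as “about 83% identity from amino acids 1-237” thus combines the two quantities computed here: the identity over the aligned region and the span of the query sequence that the region covers.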

A Gapped BlastP sequence alignment showed that SEQ ID NO:4 (618 amino acids) has about 67% identity from amino acids 7-618 with a protein from Lactobacillus johnsonii that is a two-component regulatory system histidine kinase (Accession No. NP_964082), about 66% identity from amino acids 7-618 with a protein from Lactobacillus gasseri that is a signal transduction histidine kinase (Accession No. ZP_00046799), about 54% identity from amino acids 2-618 with a protein from Lactobacillus sakei that is a putative histidine kinase (Accession No. AAD10264), about 52% identity from amino acids 4-617 with a protein from Lactobacillus plantarum that is a histidine kinase sensor protein (Accession No. NP_783897), and about 47% identity from amino acids 12-616 with a protein from Enterococcus faecalis that is a sensory box histidine kinase VicK (Accession No. NP_814923).

A Gapped BlastP sequence alignment showed that SEQ ID NO:6 (150 amino acids) has about 79% identity from amino acids 1-150 with a hypothetical protein LJ0247 from Lactobacillus johnsonii (Accession No. NP_964263), about 70% identity from amino acids 45-150 with a protein from Lactobacillus gasseri that is a response regulator of the LytR/AlgR family (Accession No. ZP_00046165), about 37% identity from amino acids 4-150 with a protein from Streptococcus mutans that is a putative transcriptional regulator (Accession No. NP_720879), about 40% identity from amino acids 18-150 with a protein from Oenococcus oeni that is a response regulator of the LytR/AlgR family (Accession No. ZP_00069670), and about 31% identity from amino acids 1-148 with a protein from Leuconostoc mesenteroides that is a response regulator of the LytR/AlgR family (Accession No. ZP_00063955).

A Gapped BlastP sequence alignment showed that SEQ ID NO:8 (426 amino acids) has about 32% identity from amino acids 20-425 with a protein from Lactobacillus johnsonii that is a lactacin F two-component system histidine kinase (Accession No. NP_964617), about 32% identity from amino acids 10-425 with a protein from Lactobacillus salivarius that is AbpK (Accession No. AAM61782), about 27% identity from amino acids 22-426 with a protein from Lactobacillus johnsonii that is a two-component system histidine kinase (Accession No. NP_964473), about 34% identity from amino acids 141-426 with a protein from Carnobacterium piscicola that is a putative histidine kinase PisK (Accession No. AAK69421), and about 31% identity from amino acids 132-426 with a protein from Lactobacillus sakei that is a histidine kinase homolog SapK (Accession No. CAA86944).

A Gapped BlastP sequence alignment showed that SEQ ID NO:10 (265 amino acids) has about 41% identity from amino acids 2-259 with a protein from Lactobacillus salivarius that is AbpR (Accession No. AAM61783), about 40% identity from amino acids 1-256 with a protein from Lactobacillus johnsonii that is a lactacin F two-component system response regulator (Accession No. NP_964619), about 32% identity from amino acids 3-242 with a protein from Lactobacillus johnsonii that is a two-component system response regulator (Accession No. NP_964474), about 29% identity from amino acids 1-250 with a protein from Lactobacillus sakei that is a sakacin A production response regulator SapR (Accession No. CAA86945), and about 29% identity from amino acids 1-246 with a protein from Carnobacterium piscicola that is a response regulator (Accession No. AAB81306).

A Gapped BlastP sequence alignment showed that SEQ ID NO:12 (240 amino acids) has about 73% identity from amino acids 3-239 with a protein from Lactobacillus gasseri that is a response regulator consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain (Accession No. ZP_00046225), about 63% identity from amino acids 3-239 with a protein from Lactobacillus sakei that is a putative response regulator (Accession No. AAD10267), about 63% identity from amino acids 3-236 with a protein from Enterococcus faecium that is a response regulator consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain (Accession No. ZP_00036862), about 62% identity from amino acids 3-240 with a protein from Lactobacillus plantarum that is a response regulator (Accession No. NP_785945), and about 62% identity from amino acids 3-236 with a protein from Enterococcus faecalis that is a DNA-binding response regulator (Accession No. NP_814983).

A Gapped BlastP sequence alignment showed that SEQ ID NO:14 (483 amino acids) has about 61% identity from amino acids 1-482 with a protein from Lactobacillus johnsonii that is a two-component system histidine kinase (Accession No. NP_964774), about 61% identity from amino acids 8-482 with a protein from Lactobacillus gasseri that is a signal transduction histidine kinase (Accession No. ZP_00046226), about 45% identity from amino acids 1-475 with a protein from Lactobacillus sakei that is a putative histidine kinase (Accession No. AAD10268), about 44% identity from amino acids 1-479 with a protein from Lactobacillus plantarum that is a histidine kinase sensor protein (Accession No. CAD64795), and about 41% homology from amino acids 1-474 with a protein from Enterococcus faecalis that is a sensor histidine kinase (Accession No. NP_814984).

A Gapped BlastP sequence alignment showed that SEQ ID NO:16 (367 amino acids) has about 27% identity from amino acids 10-363 with a protein from Oenococcus oeni that is a COG2199: FOG: GGDEF domain (Accession No. ZP_00069778), about 32% identity from amino acids 114-366 with a protein from Listeria monocytogenes that is similar to unknown proteins (hypothetical sensory transduction histidine kinase) (Accession No. NP_465435), about 30% identity from amino acids 114-366 with a protein from Listeria innocua that is a hypothetical sensory transduction histidine kinase (Accession No. NP_471359), about 38% identity from amino acids 200-366 with a protein from Leuconostoc mesenteroides that is a COG2199: FOG: GGDEF domain (Accession No. ZP_00062660), and about 33% identity with a protein from Vibrio vulnificus that is a GGDEF family protein (Accession No. NP_936516).

A Gapped BlastP sequence alignment showed that SEQ ID NO:18 (236 amino acids) has about 33% identity from amino acids 12-228 with a protein from Leuconostoc mesenteroides that is a COG2200: FOG: EAL domain (Accession No. ZP_00062661), about 33% identity from amino acids 12-223 with a protein from Leuconostoc mesenteroides that is a COG2200: FOG: EAL domain (Accession No. ZP_00062662), about 28% identity from amino acids 12-224 with a protein from Lactococcus lactis that is a hypothetical protein (Accession No. CAA04442), about 26% identity from amino acids 6-228 with a protein from Listeria monocytogenes that is lmo0111 (Accession No. NP_463644), and about 26% identity from amino acids 8-228 with a protein from Listeria innocua that is lin0158 (Accession No. NP_469503).

A Gapped BlastP sequence alignment showed that SEQ ID NO:20 (427 amino acids) has about 59% identity from amino acids 4-427 with a protein from Lactobacillus gasseri that is a signal transduction histidine kinase (Accession No. ZP_00046476), about 62% identity from amino acids 38-427 with a protein from Lactobacillus johnsonii that is a two-component system histidine kinase (Accession No. NP_965390), about 37% identity from amino acids 4-421 with an unknown protein from Streptococcus agalactiae (Accession No. NP_735834), about 37% identity from amino acids 4-421 with a protein from Streptococcus agalactiae that is a sensor histidine kinase (Accession No. NP_688325), and about 37% identity from amino acids 4-423 with a protein from Streptococcus mutans that is a putative histidine kinase (Accession No. NP_721328).

A Gapped BlastP sequence alignment showed that SEQ ID NO:22 (221 amino acids) has about 77% identity from amino acids 1-220 with a protein from Lactobacillus johnsonii that is a two-component response regulator (Accession No. NP_965391), about 77% identity from amino acids 1-220 with proteins from Lactobacillus gasseri that are response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain (Accession No. ZP_00046475), about 59% identity from amino acids 1-221 with a protein from Streptococcus pyogenes that is a putative two-component response regulator (Accession No. NP_269073), about 57% identity from amino acids 1-221 with an unknown protein from Streptococcus agalactiae (Accession No. NP_735835), and about 58% identity from amino acids 1-221 with a protein from Streptococcus agalactiae that is a DNA-binding response regulator (Accession No. NP_688326).

A Gapped BlastP sequence alignment showed that SEQ ID NO:24 (525 amino acids) has about 53% identity from amino acids 1-502 with a protein from Lactobacillus johnsonii that is a two-component system histidine kinase (Accession No. NP_965436), about 55% identity from amino acids 100-509 with a protein from Lactobacillus gasseri that is a signal transduction histidine kinase (Accession No. ZP_00047348), about 39% identity from amino acids 52-518 with a protein from Lactobacillus plantarum that is a histidine protein kinase sensor protein (Accession No. NP_785147), about 39% identity from amino acids 12-500 with a protein from Enterococcus faecalis that is a sensor histidine kinase (Accession No. NP_814784), and about 47% identity from amino acids 211-507 with a protein from Leuconostoc mesenteroides that is a signal transduction histidine kinase (Accession No. ZP_00063323).

A Gapped BlastP sequence alignment showed that SEQ ID NO:26 (238 amino acids) has about 74% identity from amino acids 1-237 with a protein from Lactobacillus johnsonii that is a two-component regulatory system response regulator (Accession No. NP_965437), about 61% identity from amino acids 1-237 with a protein from Lactobacillus gasseri that is a response regulator consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain (Accession No. ZP_00047347), about 63% identity from amino acids 1-231 with a protein from Lactobacillus plantarum that is a response regulator (Accession No. NP_785146), about 61% identity from amino acids 1-231 with a protein from Enterococcus faecalis that is a DNA-binding response regulator (Accession No. 814783), and about 59% identity from amino acids 1-231 with a protein from Listeria innocua that is a two-component response regulator (Accession No. NP_470750).

A Gapped BlastP sequence alignment showed that SEQ ID NO:28 (247 amino acids) has about 73% identity from amino acids 21-247 with a protein from Lactobacillus johnsonii that is a two-component system response regulator (Accession No. NP_964988), about 48% identity from amino acids 21-241 with a protein from Clostridium tetani that is a transcriptional regulatory protein (Accession No. NP_781768), about 47% identity from amino acids 22-245 with a protein from Lactobacillus plantarum that is a response regulator (Accession No. NP_784099), about 48% identity from amino acids 21-247 with proteins from Thermoanaerobacter tengcongensis that are response regulators consisting of a CheY-like receiver domain and a HTH DNA-binding domain (Accession No. NP_622667), and about 46% identity from amino acids 21-241 with a protein from Clostridium acetobutylicum that is a response regulator (Accession No. NP_348326).

A Gapped BlastP sequence alignment showed that SEQ ID NO:30 (441 amino acids) has about 49% identity from amino acids 2-439 with a protein from Lactobacillus johnsonii that is a two-component system histidine kinase (Accession No. NP_964989), about 32% identity from amino acids 3-434 with a protein from Lactococcus lactis that is a sensor protein kinase (Accession No. NP_267160), about 32% identity from amino acids 2-434 with a protein from Lactococcus lactis that is a histidine kinase (Accession No. AAC45387), about 31% identity from amino acids 3-438 with a protein from Oenococcus oeni that is a signal transduction histidine kinase (Accession No. ZP_00069020), and about 36% identity from amino acids 79-437 with a protein from Lactobacillus plantarum that is a histidine protein kinase sensor protein (Accession No. NP_784098).

A Gapped BlastP sequence alignment showed that SEQ ID NO:32 (274 amino acids) has about 55% identity from amino acids 11-274 with a protein from Lactobacillus johnsonii that is a lactacin F two-component system response regulator (Accession No. NP_964619), about 38% identity from amino acids 9-268 with a protein from Lactobacillus salivarius that is AbpR (Accession No. AAM61783), about 32% identity from amino acids 9-262 with a protein from Lactobacillus johnsonii that is a two-component system response regulator (Accession No. NP_964474), about 47% identity from amino acids 140-274 with a protein from Lactobacillus johnsonii that is a lactacin F two-component system response regulator (Accession No. 964627), and about 33% identity from amino acids 12-262 with a protein from Lactobacillus sakei that is a response regulator (Accession No. CAA86945).

A Gapped BlastP sequence alignment showed that SEQ ID NO:34 (440 amino acids) has about 39% identity from amino acids 2-435 with a protein from Lactobacillus johnsonii that is a lactacin F two-component system histidine kinase (Accession No. NP_964617), about 31% identity from amino acids 58-431 with a protein from Lactobacillus salivarius that is AbpK (Accession No. AAM61782), about 31% identity from amino acids 73-431 with a protein from Lactobacillus johnsonii that is a two-component histidine kinase (Accession No. NP_964473), about 24% identity from amino acids 59-418 with a protein from Carnobacterium piscicola that is a histidine protein kinase (Accession No. AAB81305), and about 25% identity from amino acids 59-412 with a protein from Carnobacterium piscicola that is a histidine kinase CbaK (Accession No. AAF18146).

A Gapped BlastP sequence alignment showed that SEQ ID NO:36 (381 amino acids) has about 63% identity from amino acids 1-381 with a protein from Lactobacillus gasseri that is a signal transduction histidine kinase (Accession No. ZP_00046636), about 63% identity from amino acids 1-381 with a protein from Lactobacillus johnsonii that is a two-component system histidine kinase (Accession No. NP_965691), about 52% identity from amino acids 4-375 with a protein from Lactobacillus sakei that is a putative histidine kinase (Accession No. AAD10266), about 53% identity from amino acids 6-375 with a protein from Lactobacillus plantarum that is a histidine kinase sensor protein (Accession No. NP_786468), and about 52% identity from amino acids 2-379 with a protein from Enterococcus faecium that is a signal transduction histidine kinase (Accession No. ZP_00036366).

A Gapped BlastP sequence alignment showed that SEQ ID NO:38 (228 amino acids) has about 89% identity from amino acids 1-228 with proteins from Lactobacillus gasseri that are response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain (Accession No. ZP_00046635), about 85% identity from amino acids 1-227 with a protein from Lactobacillus sakei that is a putative response regulator (Accession No. AAD10265), about 81% identity from amino acids 1-228 with a protein from Lactobacillus plantarum that is a response regulator (Accession No. NP_786469), about 80% identity from amino acids 2-227 with proteins from Oenococcus oeni that are response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain (Accession No. ZP_00069111), and about 80% identity from amino acids 1-228 with a protein from Enterococcus faecalis that is a DNA-binding response regulator (Accession No. NP_816885).

A Gapped BlastP sequence alignment showed that SEQ ID NO:40 (254 amino acids) has about 42% identity from amino acids 3-254 with a protein from Lactobacillus johnsonii that is a hypothetical protein LJ0802 (Accession No. NP_964657), about 42% identity from amino acids 3-254 with proteins from Lactobacillus gasseri that are transcriptional regulators of sugar metabolism (Accession No. ZP_00046400), about 30% identity from amino acids 1-239 with a protein from Listeria monocytogenes that is similar to a transcriptional regulator (DeoR) (Accession No. NP_465631), about 30% identity from amino acids 1-231 with a protein from Oceanobacillus iheyensis that is a transcriptional repressor of the phosphotransferase system (Accession No. NP_693730), and about 28% identity from amino acids 1-239 with a protein from Listeria innocua that is similar to a transcriptional regulator (DeoR family) (Accession No. NP_471545).

A Gapped BlastP sequence alignment showed that SEQ ID NO:42 (805 amino acids) has about 84% identity from amino acids 7-805 with a protein from Lactobacillus johnsonii that is a probable xylulose-5-phosphate/fructose-6-phosphate phosphoketolase (Accession No. NP_964658), about 65% identity from amino acids 7-805 with a protein from Lactobacillus plantarum that is a phosphoketolase (Accession No. NP_786060), about 65% identity from amino acids 7-805 with a protein from Lactobacillus pentosus that is similar to a phosphoketolase (Accession No. CAC84393), about 65% identity from amino acids 6-805 with a protein from Oenococcus oeni that is a phosphoketolase (Accession No. ZP_00069369), and about 65% identity from amino acids 7-805 with a protein from Lactobacillus paraplantarum that is a xylulose-5-phosphate phosphoketolase (Accession No. AAQ64626).

A Gapped BlastP sequence alignment showed that SEQ ID NO:44 (286 amino acids) has about 59% identity from amino acids 5-286 with a protein from Lactobacillus johnsonii that is a hypothetical protein LJ0785 (Accession No. NP_964640), about 59% identity from amino acids 5-286 with a protein from Lactobacillus gasseri that is a predicted esterase of the alpha-beta hydrolase superfamily (Accession No. ZP_00045972), about 41% identity from amino acids 5-284 with a protein from Fusobacterium nucleatum that is a serine protease (Accession No. ZP_00143830), about 41% identity from amino acids 5-284 with a protein from Fusobacterium nucleatum that is a serine protease (Accession No. NP_603405), and about 41% identity from amino acids 5-284 with a protein from Streptococcus agalactiae that is a protein of unknown function (Accession No. NP_689045).

A Gapped BlastP sequence alignment showed that SEQ ID NO:46 (402 amino acids) has about 35% identity from amino acids 74-387 with a protein from Lactobacillus gasseri that is a predicted metal-dependent membrane protease (Accession No. ZP_00046861), about 29% identity from amino acids 1-389 with a protein from Lactobacillus johnsonii that is a hypothetical protein LJ1642 (Accession No. NP_965449), about 26% identity from amino acids 113-392 with a protein from Lactobacillus plantarum that is a CAAX family membrane-bound protease (Accession No. NP_786255), about 27% identity from amino acids 90-383 with a hypothetical protein from Lactobacillus gasseri (Accession No. ZP_00047041), and about 23% identity from amino acids 104-389 with a protein from Lactobacillus johnsonii that is hypothetical protein LJ0777 (Accession No. NP_964632).

A Gapped BlastP sequence alignment showed that SEQ ID NO:48 (224 amino acids) has about 31% identity from amino acids 64-222 with proteins from Methanosarcina barkeri that are flavodoxins (Accession No. ZP_00079137), about 27% identity from amino acids 37-224 with proteins from Leuconostoc mesenteroides that are flavodoxins (Accession No. ZP_00062708), about 29% identity from amino acids 64-222 with a protein from Porphyromonas gingivalis that is a putative flavodoxin (Accession No. NP_905330), about 29% identity from amino acids 63-224 with a protein from Azotobacter vinelandii that is a flavodoxin (Accession No. ZP_00092501), and about 33% identity from amino acids 74-224 with a protein from Methanosarcina barkeri that is a flavodoxin (Accession No. ZP_00079128).

A Gapped BlastP sequence alignment showed that SEQ ID NO:50 (293 amino acids) has about 85% identity from amino acids 20-287 with a protein from Lactobacillus johnsonii that is a hypothetical protein LJ1250 (Accession No. NP_965105), about 83% identity from amino acids 22-291 with proteins from Lactobacillus gasseri that are membrane protease subunits, stomatin/prohibitin homologs (Accession No. ZP_00045910), about 60% identity from amino acids 22-286 with proteins from Leuconostoc mesenteroides that are membrane protease subunits, stomatin/prohibitin homologs (Accession No. ZP_00063597), about 57% identity from amino acids 22-289 with proteins from Oenococcus oeni that are membrane protease subunits, stomatin/prohibitin homologs (Accession No. ZP_00069250), and about 43% identity from amino acids 23-283 with an unknown protein from Lactobacillus plantarum (Accession No. NP_784144).

A Gapped BlastP sequence alignment showed that SEQ ID NO:52 (105 amino acids) has about 30% identity from amino acids 7-100 with a hypothetical protein from Streptococcus pyogenes (Accession No. NP_664197), about 30% identity from amino acids 7-100 with a hypothetical protein from Streptococcus pyogenes (Accession No. NP_606807), about 34% identity from amino acids 3-72 with a hypothetical protein from Streptococcus pyogenes (Accession No. NP_268822), about 40% identity from amino acids 8-60 with a protein from Treponema denticola that is a putative DNA-damage-inducible protein J (Accession No. NP_971120), and about 30% identity from amino acids 3-58 with a protein from Desulfitobacterium hafniense that is a DNA-damage-inducible protein J (Accession No. ZP_00099746).

A Gapped BlastP sequence alignment showed that SEQ ID NO:54 (325 amino acids) has about 42% identity from amino acids 1-323 with a hypothetical protein from Lactobacillus gasseri (Accession No. ZP_00047284), about 41% identity from amino acids 9-323 with a hypothetical protein LJ0696 from Lactobacillus johnsonii (Accession No. NP_964548), about 35% identity from amino acids 17-322 with a protein from Lactobacillus helveticus that is a helveticin (Accession No. AAA63274), about 24% identity from amino acids 116-258 with a protein from Rattus norvegicus that is similar to Gli3 protein (Accession No. XP_225411), and about 21% identity from amino acids 153-289 with a protein from Saccharomyces cerevisiae that is Tom1p (Accession No. NP_010745).

A Gapped BlastP sequence alignment showed that SEQ ID NO:56 (272 amino acids) has about 38% identity from amino acids 3-272 with proteins from Lactobacillus gasseri that are predicted hydrolases of the HAD superfamily (Accession No. ZP_00046918), about 36% identity from amino acids 7-270 with a protein from Streptococcus mutans that is a conserved hypothetical protein (Accession No. NP_721496), about 34% identity from amino acids 7-272 with a protein from Streptococcus agalactiae that is unknown (Accession No. NP_735618), about 33% identity from amino acids 7-272 with a protein from Streptococcus agalactiae that is a haloacid dehalogenase-like family hydrolase (Accession No. AAM99986), and about 33% identity from amino acids 7-272 with a protein from Listeria innocua that is a conserved hypothetical protein lin0440 (Accession No. NP_469785).

A Gapped BlastP sequence alignment showed that SEQ ID NO:58 (146 amino acids) has about 35% identity from amino acids 26-145 with proteins from Lactobacillus gasseri that are transcriptional regulators (Accession No. ZP_00045996), about 35% identity from amino acids 28-142 with a protein from Lactococcus lactis that is a transcriptional regulator (Accession No. NP_267638), about 31% identity from amino acids 29-142 with a protein from Clostridium acetobutylicum that is a MarR/EmrR family transcriptional regulator (Accession No. NP_349100), about 34% identity from amino acids 14-104 with a protein from Methanothermobacter thermautotrophicus that is a transcription regulator (Accession No. NP_275456), and about 27% identity from amino acids 14-142 with a protein from Staphylococcus aureus that is a hypothetical protein (Accession No. NP_370857).

A Gapped BlastP sequence alignment showed that SEQ ID NO:60 (585 amino acids) has about 57% identity from amino acids 16-582 with a protein from Lactobacillus brevis that is encoded by a Hop-resistant MDR (multidrug resistance)-like gene (Accession No. BAA21552), about 57% identity from amino acids 16-584 with a protein from Lactobacillus plantarum that is a multidrug ABC transporter ATP-binding and permease protein (Accession No. NP_786297), about 51% identity from amino acids 11-582 with a protein from Lactococcus lactis that is a multidrug resistance protein LmrA (Accession No. AAB49750), about 51% identity from amino acids 11-582 with a protein from Lactococcus lactis that is a multidrug resistance ABC transporter ATP-binding and permease protein (Accession No. Q9CHL8), and about 51% identity from amino acids 11-582 with a protein from Lactococcus lactis that is a multidrug resistance ABC transporter ATP-binding and permease protein (Accession No. NP_266867).

A Gapped BlastP sequence alignment showed that SEQ ID NO:62 (118 amino acids) has about 33% identity from amino acids 4-115 with a protein from Lactobacillus gasseri that is a hypothetical protein (Accession No. ZP_00046399), about 26% identity from amino acids 29-115 with a protein from Carnobacterium divergens that is dvnl (Accession No. CAA11807), about 26% identity from amino acids 29-115 with a protein from Lactobacillus plantarum that is a bacteriocin immunity protein (Accession No. NP_786516), about 38% identity from amino acids 74-117 with a protein from Equine coronavirus NC99 that is a spike protein (Accession No. AAQ67205), and about 25% identity from amino acids 20-115 with a protein from Clostridium acetobutylicum that is an uncharacterized protein similar to the mesC/lccI/entI family bacteriocin immunity protein (Accession No. NP_149170).

A Gapped BlastP sequence alignment showed that SEQ ID NO:64 (505 amino acids) has about 27% identity from amino acids 3-481 with a protein from Thermoanaerobacter tengcongensis that is aminopeptidase N (Accession No. NP_624209), about 33% identity from amino acids 122-369 with a protein from Streptomyces avermitilis that is a putative metallopeptidase (Accession No. NP_821429), about 31% identity from amino acids 122-371 with a protein from Streptomyces coelicolor that is a putative metallopeptidase (Accession No. NP_631646), about 24% identity from amino acids 11-480 with a protein from Chloroflexus aurantiacus that is a hypothetical protein (Accession No. ZP_00017564), and about 23% identity from amino acids 282-499 with a protein from Xylella fastidiosa that is aminopeptidase N (Accession No. ZP_00042138).

A Gapped BlastP sequence alignment showed that SEQ ID NO:66 (353 amino acids) has about 22% identity from amino acids 128-344 with proteins from Haemophilus somnus that are proteins involved in heme utilization (Accession No. ZP_00133280), and about 21% identity with a protein from Homo sapiens that is unknown (Accession No. AAH62424).

A Gapped BlastP sequence alignment showed that SEQ ID NO:68 (201 amino acids) has about 27% identity from amino acids 1-134 with a protein from Xylella fastidiosa that is a transposase and inactivated derivatives (Accession No. ZP_00038374), about 58% identity from amino acids 159-201 with a protein from Lactobacillus delbrueckii that is a transposase for insertion sequence element (Accession No. AAQ06905), about 26% identity from amino acids 1-134 with a protein from Xylella fastidiosa that is a transposase and inactivated derivatives (Accession No. ZP_00038149), about 24% identity from amino acids 27-196 with a protein from Nostoc sp. that is a transposase (Accession No. NP_490351), and about 25% identity from amino acids 1-132 with a protein from Xylella fastidiosa that is a transposase and inactivated derivatives (Accession No. ZP_00038301).

A Gapped BlastP sequence alignment showed that SEQ ID NO:70 (180 amino acids) has about 67% identity from amino acids 1-138 with a protein from Lactobacillus delbrueckii that is a transposase for insertion sequence element (Accession No. AAQ06905), about 36% identity from amino acids 2-179 with a protein from Clostridium perfringens that is a probable transposase (Accession No. NP_561584), about 36% identity from amino acids 2-178 with a protein from Clostridium tetani that is a transposase (Accession No. NP_781063), about 36% identity from amino acids 2-179 with a protein from Clostridium perfringens that is a probable transposase (Accession No. NP_562803), and about 35% identity from amino acids 2-179 with a protein from Clostridium tetani that is a transposase (Accession No. AAO35235).

A Gapped BlastP sequence alignment showed that SEQ ID NO:72 (444 amino acids) has about 55% identity from amino acids 1-432 with a protein from Lactobacillus plantarum that is a cation efflux protein (Accession No. NP_783937), about 42% identity from amino acids 2-432 with a protein from Bifidobacterium longum that is a Na⁺-driven multidrug efflux pump (Accession No. ZP_00120269), about 33% identity from amino acids 3-421 with a protein from Clostridium tetani that is a Na⁺-driven multidrug efflux pump (Accession No. NP_781116), about 31% identity from amino acids 3-431 with a protein from Methanosarcina acetivorans that is an integral membrane protein (Accession No. NP_616062), and about 29% identity from amino acids 7-432 with a protein from Clostridium acetobutylicum that is a predicted membrane protein and probable cation efflux pump (MDR-type) (Accession No. NP_349099).

A Gapped BlastP sequence alignment showed that SEQ ID NO:74 (64 amino acids) has about 28% identity from amino acids 4-49 with a protein from Nostoc sp. that is a hypothetical protein (Accession No. NP_478212).

A Gapped BlastP sequence alignment showed that SEQ ID NO:76 (63 amino acids) has about 40% identity from amino acids 9-39 with a protein from Bacillus subtilis that is an assimilatory nitrate reductase (Accession No. NP_388214), and about 40% identity from amino acids 9-41 with a protein from Bacillus subtilis that is an assimilatory nitrite reductase (Accession No. NP_388212).

A Gapped BlastP sequence alignment showed that SEQ ID NO:78 (438 amino acids) has about 40% identity from amino acids 66-188 with a protein from Lactobacillus salivarius that is unknown (Accession No. AAM61773), about 28% identity from amino acids 4-297 with a protein from Streptococcus mutans that is a hypothetical protein (Accession No. NP_722210), about 26% identity from amino acids 101-220 with a protein from Streptococcus agalactiae that is a putative bacteriocin transport accessory protein (Accession No. NP_687482), about 27% identity from amino acids 86-216 with a protein from Brochothrix campestris that is a transport accessory protein (Accession No. AAC95141), and about 25% identity from amino acids 101-220 with a protein from Streptococcus agalactiae that is unknown (Accession No. NP_734963).

A Gapped BlastP sequence alignment showed that SEQ ID NO:80 (196 amino acids) has about 56% identity from amino acids 1-196 with a protein from Lactobacillus gasseri that is a putative gassericin K7 B accessory protein (Accession No. AAP73779), about 56% identity from amino acids 1-196 with a protein from Lactobacillus gasseri that is ORF2 (Accession No. BAA82351), about 55% identity from amino acids 10-196 with a protein from Lactobacillus gasseri that is unknown (Accession No. AAP56342), about 49% identity from amino acids 10-196 with a protein from Lactobacillus sp. that is a hypothetical protein in the LAF 5′ region (ORF1) (Accession No. AAA16635), and about 28% identity from amino acids 41-195 with a protein from Lactobacillus casei that is an ABC-transporter accessory factor (Accession No. NP_542220).

A Gapped BlastP sequence alignment showed that SEQ ID NO:82 (720 amino acids) has about 68% identity from amino acids 1-720 with a protein from Lactobacillus salivarius that is AbpT (Accession No. AAM61785), about 62% identity from amino acids 9-720 with a protein from Lactobacillus plantarum that is an ATP-binding and permease protein PlnG bacteriocin ABC-transporter (Accession No. NP_784218), about 62% identity from amino acids 9-720 with a protein from Lactobacillus plantarum that is the ABC-transporter PlnG (Accession No. CAA64189), about 62% identity from amino acids 6-720 with a protein from Lactobacillus sakei that is the probable ATP-dependent translocation protein sppT (Accession No. AAA16635), and about 62% identity from amino acids 2-720 with a protein from Lactobacillus sakei that is an ABC-exporter (Accession No. CAA86946).

A Gapped BlastP sequence alignment showed that SEQ ID NO:84 (83 amino acids) has about 100% identity from amino acids 20-42 with a protein from Lactobacillus acidophilus that is the acidocin J1132 alpha peptide (N-terminal) (Accession No. AAB49523), and about 100% identity from amino acids 19-42 with a protein from Lactobacillus acidophilus that is the acidocin J1132 beta peptide (Accession No. AAB49524).

A Gapped BlastP sequence alignment showed that SEQ ID NO:94 (208 amino acids) has about 25% identity from amino acids 23-125 with a hypothetical protein from Lactobacillus helveticus (Accession No. CAA57507).

A Gapped BlastP sequence alignment showed that SEQ ID NO:98 (197 amino acids) has about 35% identity from amino acids 3-196 with a protein from Lactobacillus gasseri that is a predicted metal-dependent membrane protease (Accession No. ZP_00046861), about 38% identity from amino acids 1-151 with a protein from Lactobacillus gasseri that is a hypothetical protein (Accession No. ZP_00047041), about 26% identity from amino acids 3-183 with a protein from Lactobacillus plantarum that is a CAAX family membrane-bound protease (Accession No. NP_786255), about 30% identity from amino acids 1-142 with a protein from Lactobacillus gasseri that is a predicted metal-dependent membrane protease (Accession No. ZP_00047281), and about 35% identity from amino acids 80-156 with a protein from Lactobacillus plantarum that is the CAAX family membrane-bound protease immunity protein PlnI (Accession No. NP_784215).

A Gapped BlastP sequence alignment showed that SEQ ID NO:100 (263 amino acids) has about 23% identity from amino acids 57-263 with a protein from Lactobacillus gasseri that is a hypothetical protein (Accession No. ZP_00047041), about 33% identity from amino acids 134-201 with a protein from Halobacterium sp. that is the 3-oxoacyl-[acyl-carrier protein] reductase FabG (Accession No. NP_280196), about 29% identity from amino acids 62-245 with a protein from Lactobacillus gasseri that is a predicted metal-dependent protease (Accession No. ZP_00046861), about 30% identity from amino acids 26-109 with a protein from Plasmodium falciparum that is a conserved hypothetical protein (Accession No. NP_701942), and about 26% identity from amino acids 83-229 with a protein from Avian infectious bronchitis virus that is the replicase polyprotein 1ab (Accession No. AAP92673).

A Gapped BlastP sequence alignment showed that SEQ ID NO:102 (398 amino acids) has about 30% identity from amino acids 6-396 with a protein from Lactobacillus gasseri that is a predicted metal-dependent membrane protease (Accession No. ZP_00046861), about 27% identity from amino acids 4-392 with a protein from Lactobacillus gasseri that is a hypothetical protein (Accession No. ZP_00047041), about 30% identity from amino acids 201-381 with a protein from Lactobacillus gasseri that is a predicted metal-dependent membrane protease (Accession No. ZP_00047281), about 24% identity from amino acids 103-394 with a protein from Lactobacillus plantarum that is a CAAX family membrane-bound protease (Accession No. NP_786255), and about 35% identity from amino acids 256-360 with a protein from Lactobacillus plantarum that is the CAAX family membrane-bound protease immunity protein PlnP (Accession No. NP_784209).

A Gapped BlastP sequence alignment showed that SEQ ID NO:104 (103 amino acids) has about 48% identity from amino acids 51-83 with a protein from Pyrococcus abyssi that is a hypothetical molybdenum cofactor (Accession No. NP_126386), about 28% identity from amino acids 1-92 with a protein from Dictyostelium discoideum that is a vacuolar proton ATPase 100 kDa subunit (Accession No. AAB49621), about 28% identity from amino acids 34-102 with a protein from Agrobacterium tumefaciens that is a conserved hypothetical protein (Accession No. NP_535397), about 40% identity from amino acids 53-94 with a protein from Ralstonia solanacearum that is a putative hemagglutinin-related protein (Accession No. NP_521309), and about 44% identity from amino acids 48-76 with a protein from Lactobacillus gasseri that is ORF3 (Accession No. BAA82352).

A Gapped BlastP sequence alignment showed that SEQ ID NO:106 (767 amino acids) has about 71% identity from amino acids 3-766 with proteins from Lactobacillus gasseri that are alpha-glucosidases (Accession No. ZP_00046641), about 65% identity from amino acids 5-761 with a protein from Lactobacillus plantarum that is an alpha-glucosidase (Accession No. NP_621719), about 40% identity from amino acids 15-767 with a protein from Thermoanaerobacter tengcongensis that is an alpha-glucosidase (Accession No. NP_535397), about 40% identity from amino acids 20-717 with a protein from Bacillus thermoamyloliquefaciens that is alpha-glucosidase II (Accession No. Q9F234), and about 38% identity from amino acids 10-750 with proteins from Nostoc punctiforme that are alpha-glucosidases (Accession No. ZP_00110705).

A Gapped BlastP sequence alignment showed that SEQ ID NO:116 (249 amino acids) has about 90% identity from amino acids 1-249 with a protein from Lactobacillus gasseri that is an aspartate racemase (Accession No. ZP_00046638), about 87% identity from amino acids 1-249 with a protein from Lactobacillus johnsonii that is an aspartate racemase (Accession No. NP_965689), about 52% identity from amino acids 1-234 with a protein from Pediococcus pentosaceus that is an aspartate racemase (Accession No. CAA43598), and about 48% identity from amino acids 1-235 with a protein from Streptococcus thermophilus that is an aspartate racemase (Accession No. ZP00285115).

A Gapped BlastP sequence alignment showed that SEQ ID NO:118 (523 amino acids) has about 85% identity from amino acids 2-519 with a protein from Lactobacillus johnsonii that is a UDP-N-acetylmuramoyl-L-alanyl-D-glutamate lysine ligase (Accession No. NP_965690), about 85% identity from amino acids 2-519 with a protein from Lactobacillus johnsonii that is a UDP-N-acetylmuramyl tripeptide synthase (Accession No. ZP_00046637), about 52% identity from amino acids 1-510 with a protein from Pediococcus pentosaceus that is a UDP-N-acetylmuramyl tripeptide synthase (Accession No. ZP_00323229), and about 45% identity from amino acids 1-515 with a protein from Leuconostoc mesenteroides that is a UDP-N-acetylmuramyl tripeptide synthase (Accession No. ZP_00062837).

A Gapped BlastP sequence alignment showed that SEQ ID NO:120 (621 amino acids) has about 84% identity from amino acids 7-620 with a protein from Lactobacillus johnsonii that are ABC transporter ATPase and permease components (Accession No. NP_965693), about 82% identity from amino acids 10-620 with a protein from Lactobacillus gasseri that are ABC-type multidrug transport system, ATPase and permease components (Accession No. ZP_00046634), about 52% identity with a protein from Clostridium acetobutylicum that is an ABC-type multidrug/protein/lipid transport system, ATPase component (Accession No. NP_350005), and about 52% identity from amino acids 40-621 with a protein from Desulfitobacterium hafniense that are ABC-type multidrug transport system, ATPase and permease components (Accession No. ZP_00099385).

A Gapped BlastP sequence alignment showed that SEQ ID NO:122 (576 amino acids) has about 83% identity from amino acids 1-576 with a protein from Lactobacillus gasseri that are ABC-type multidrug transport system ATPase and permease components (Accession No. ZP_00046633), about 83% identity from amino acids 1-576 with a protein from Lactobacillus johnsonii that are ABC transporter ATPase and permease components (Accession No. NP_965694), about 51% identity from amino acids 1-574 with a protein from Desulfitobacterium hafniense that are ABC-type multidrug transport system ATPase and permease components (Accession No. ZP_00099386), and about 50% identity from amino acids 1-569 with a protein from Bifidobacterium longum that is an ATP-binding protein of an ABC transporter (Accession No. NP_696913).

A Gapped BlastP sequence alignment showed that SEQ ID NO:124 (452 amino acids) has about 40% identity from amino acids 4-431 with a protein from Lactobacillus gasseri that is an uncharacterized protein conserved in bacteria (Accession No. ZP_00341762), about 39% identity from amino acids 4-431 with a protein from Lactobacillus johnsonii that is a hypothetical protein (Accession No. NP_964083), about 26% identity from amino acids 9-427 with a protein from Lactobacillus plantarum that is a hypothetical protein (Accession No. NP_783898), and about 25% identity from amino acids 21-427 with a protein from Pediococcus pentosaceus that is an uncharacterized protein conserved in bacteria (Accession No. ZP_00323558).

A Gapped BlastP sequence alignment showed that SEQ ID NO:126 (274 amino acids) has about 42% identity from amino acids 1-268 with a protein from Lactobacillus johnsonii that is a hypothetical protein (Accession No. NP_964084), about 40% identity from amino acids 1-268 with a protein from Lactobacillus gasseri that is an uncharacterized protein conserved in bacteria (Accession No. ZP_0046801), about 33% identity from amino acids 1-269 with a protein from Lactobacillus plantarum that is a hypothetical protein (Accession No. NP_783899), and about 27% identity from amino acids 1-266 with a protein from Pediococcus pentosaceus that is an uncharacterized protein conserved in bacteria (Accession No. ZP_00323559).

A Gapped BlastP sequence alignment showed that SEQ ID NO:128 (265 amino acids) has about 74% identity from amino acids 1-265 with a protein from Lactobacillus gasseri that are metal-dependent hydrolases of the beta-lactamase superfamily (Accession No. ZP_00046802), about 73% identity from amino acids 1-265 with a protein from Lactobacillus johnsonii that is a hypothetical protein (Accession No. NP_964085), about 52% identity from amino acids 1-265 with a protein from Lactobacillus plantarum that is a hydrolase (Accession No. NP_783900), and about 52% identity from amino acids 1-255 with a protein from Pediococcus pentosaceus that are metal-dependent hydrolases of the beta-lactamase superfamily (Accession No. ZP_00323560).

A Gapped BlastP sequence alignment showed that SEQ ID NO:130 (423 amino acids) has about 86% identity from amino acids 12-423 with a protein from Lactobacillus helveticus that is HtrA (Accession No. CAA06668), about 60% identity from amino acids 18-420 with a protein from Lactobacillus johnsonii that is a serine protease do-like HtrA (Accession No. NP_964086), about 50% identity from amino acids 36-412 with a protein from Pediococcus pentosaceus that are trypsin-like serine proteases (Accession No. ZP_00323561), and about 41% identity from amino acids 22-420 with a protein from Exiguobacterium sp. that is a trypsin-like serine protease (Accession No. ZP_00184047).

A Gapped BlastP sequence alignment showed that SEQ ID NO:118 (236 amino acids) has about 35% identity from amino acids 25 to 221 with a protein from Oenococcus oeni that is an EAL domain (Accession No. ZP_00319350), about 33% identity from amino acids 7 to 223 with a protein from Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293 that are EAL domains (Accession No. ZP_00062661), and about 33% identity with a protein from Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293 that is an EAL domain (Accession No. ZP_00062662).

A Gapped BlastP sequence alignment showed that SEQ ID NO:132 (56 amino acids) has about 55% identity from amino acids 1 to 56 with a protein from Oenococcus oeni PSU-1 that is an NADH:flavin oxidoreductase (Accession No. ZP_00318642), about 51% identity from amino acids 1 to 56 with a protein from Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293 that are NADH:flavin oxidoreductases (Accession No. ZP_00064370), and about 50% identity from amino acids 3 to 56 with proteins from Lactococcus lactis subsp. lactis that are NADH-dependent oxidoreductases (Accession Nos. NP_267851, AAK05793, and G86836).

A Gapped BlastP sequence alignment showed that SEQ ID NO:134 (184 amino acids) has about 68% identity from amino acids 3 to 184 with a protein from Oenococcus oeni PSU-1 that is an amidase related to nicotinamidase (Accession No. ZP_00318699), about 63% identity from amino acids 2 to 183 with a protein from Lactobacillus plantarum WCFS1 that is a pyrazinamidase/nicotinamidase (Accession Nos. NP_786021 and CAD64878), and about 59% identity from amino acids 2 to 182 with the protein from Pediococcus pentosaceus ATCC 25745 that is an amidase related to nicotinamidase (Accession No. ZP_00323805).

A Gapped BlastP sequence alignment showed that SEQ ID NO:138 (498 amino acids) has about 83% identity from amino acids 1 to 493 with a protein from Lactobacillus johnsonii NCC533 that is an amino acid transporter (Accession No. NP_965275), about 82% identity from amino acids 1 to 493 with a protein from Lactobacillus gasseri that is an amino acid transporter (Accession No. ZP_00046566), and about 46% identity from amino acids 8 to 492 with the protein from Pediococcus pentosaceus ATCC 25745 that is an amino acid transporter (Accession No. ZP_00323277).

A Gapped BlastP sequence alignment showed that SEQ ID NO:140 (231 amino acids) has about 46% identity from amino acids 1 to 230 with a protein from Lactobacillus plantarum WCFS1 that is a cell surface hydrolase (putative) (Accession No. NP_785474), about 46% identity from amino acids 1 to 231 with a protein from Lactobacillus johnsonii NCC533 that is a hypothetical protein LJ0748 (Accession No. NP_964600), and about 45% identity with the protein from Lactobacillus gasseri that is an uncharacterized protein with an alpha/beta hydrolase fold (Accession No. ZP_00045991).

A Gapped BlastP sequence alignment showed that SEQ ID NO:144 (230 amino acids) has about 27% identity from amino acids 1 to 226 with a protein from Oenococcus oeni PSU-1 that is an aldo/keto reductase related to diketogulonate reductases (Accession No. ZP_00319386), about 26% identity from amino acids 4 to 226 with a protein from Bifidobacterium longum NCC2705 that is a morphine 6-dehydrogenase (Accession No. NP_696457), and about 26% identity from amino acids 9 to 226 with a protein from Bifidobacterium longum DJO 10A that is an aldo/keto reductase related to diketogulonate reductase (Accession No. ZP_00120718).

A Gapped BlastP sequence alignment showed that SEQ ID NO:148 (392 amino acids) has about 68% identity from amino acids 1 to 389 with a protein from Lactobacillus gasseri that is a permease of the major facilitator super family (Accession No. ZP_00046919), about 68% identity from amino acids 1 to 391 with a protein from Lactobacillus johnsonii NCC533 that is a major facilitator super family permease (Accession No. NP_965415), and about 46% identity from amino acids 1 to 385 with a protein from Lactobacillus plantarum WCFS1 that is a multi-drug transport protein (Accession No. NP_784617).

A Gapped BlastP sequence alignment showed that SEQ ID NO:22 (221 amino acids) has about 77% identity with a protein from Lactobacillus johnsonii NCC 533 that is a 2-component system response regulator (Accession No. NP_965391), about 77% identity from amino acids 1 to 220 with a protein from Lactobacillus gasseri that is a response regulator consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain (Accession No. ZP_00046475), and about 59% identity from amino acids 1 to 221 with a protein from Streptococcus pyogenes SSI-1 that is a putative two-component response regulator (Accession No. NP_607078).

A Gapped BlastP sequence alignment showed that SEQ ID NO:108 (63 amino acids) has about 59% identity from amino acids 1 to 63 with a protein from Bacillus thuringiensis serovar konkukian str. 97-27 that is a flagellar hook-associated protein 1 (Accession No. YP_035858), and about 40% identity from amino acids 32 to 63 with a protein from Bacillus cereus ZK that is a flagellar hook-associated protein 1 (Accession No. YP_083109).

A Gapped BlastP sequence alignment showed that SEQ ID NO:122 (576 amino acids) has about 83% identity from amino acids 1 to 576 with a protein from Lactobacillus gasseri that is an ABC-type multi-drug transport system, ATPase and permease components (Accession No. ZP_00046633), about 83% identity from amino acids 1 to 576 with a protein from Lactobacillus johnsonii NCC 533 that is an ABC transporter ATPase and permease components (Accession Nos. NP_965694 and AAS09660), and about 51% identity from amino acids 1 to 574 with a protein from Desulfitobacterium hafniense DCB-2 that is an ABC-type multi-drug transport system, ATPase and permease components (Accession No. ZP_00099386).

A Gapped BlastP sequence alignment showed that SEQ ID NO:152 (260 amino acids) has about 45% identity from amino acids 2 to 249 with a protein from Lactobacillus gasseri that is an uncharacterized membrane-bound protein conserved in bacteria (Accession No. ZP_00046632).

A Gapped BlastP sequence alignment showed that SEQ ID NO:154 (366 amino acids) has about 93% identity from amino acids 1 to 366 with a protein from Lactobacillus johnsonii NCC533 that is a probable GTP-binding protein (Accession No. AAS09662), about 93% identity from amino acids 1 to 366 with a protein from Lactobacillus gasseri that is a predicted GTPase, probable translation factor (Accession No. ZP_00046631), about 77% identity from amino acids 1 to 366 with a protein from Pediococcus pentosaceus ATCC 25745 that is a predicted GTPase, probable translation factor (Accession No. ZP_00322452), and 76% identity from amino acids 1 to 366 with a protein from Lactobacillus plantarum WCFS1 that is a GTP-binding protein (Accession No. NP_786473).

A Gapped BlastP sequence alignment showed that SEQ ID NO:158 (294 amino acids) has about 80% identity from amino acids 1 to 293 with a protein from Lactobacillus gasseri that is a predicted transcriptional regulator (Accession No. ZP_00046630), about 78% identity from amino acids 1 to 293 with a protein from Lactobacillus johnsonii NCC 533 that is a chromosome partitioning protein ParB (Accession No. NP_965698), about 60% identity from amino acids 5 to 293 with a protein from Lactobacillus plantarum WCFS1 that is a chromosome partitioning protein (Accession No. NP_786475), and about 59% identity from amino acids 12 to 293 with a protein from Enterococcus faecalis V583 that is a chromosome partitioning protein, ParB family (Accession No. NP_816893).

A Gapped BlastP sequence alignment showed that SEQ ID NO:160 (259 amino acids) has about 85% identity from amino acids 1 to 257 with a protein from Lactobacillus johnsonii NCC533 that is a chromosome partitioning protein ParA (Accession No. NP_965699), about 85% identity from amino acids 1 to 257 with a protein from Lactobacillus gasseri that is an ATPase involved in chromosome partitioning (Accession No. ZP_00046629), and about 68% identity from amino acids 1 to 251 with a protein from Enterococcus faecalis V583 that is an ATPase, ParA family (Accession No. NP_816894).

A Gapped BlastP sequence alignment showed that SEQ ID NO:162 (276 amino acids) has about 57% identity from amino acids 1 to 276 with a protein from Lactobacillus johnsonii NCC533 that is a probable chromosome partitioning protein ParB (Accession No. NP_96570), about 58% identity from amino acids 1 to 276 with a protein from Lactobacillus gasseri that is a predicted transcriptional regulator (Accession No. ZP_00046628), about 50% identity from amino acids 14 to 275 with a protein from Lactobacillus plantarum WCFS1 that is a chromosome partitioning protein, DNA binding protein (Accession No. NP_786477), and about 50% identity from amino acids 19 to 276 with a protein from Geobacillus kaustophilus HTA426 that is a hypothetical protein GK3491 (Accession No. YP_149344).

A Gapped BlastP sequence alignment showed that SEQ ID NO:164 (240 amino acids) has about 67% identity from amino acids 1 to 239 with a protein from Lactobacillus johnsonii NCC533 that is a glucose inhibited division protein B (Accession No. NP_965701), about 66% identity from amino acids 1 to 239 with a protein from Lactobacillus gasseri that is a predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division (Accession No. ZP_00046627), and about 62% identity from amino acids 1 to 239 with a protein from Pediococcus pentosaceus ATCC 25745 that is a predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division (Accession No. ZP_00322449).
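For orientation, the percent identity values reported above can be understood as the fraction of aligned columns in which the query and subject residues are identical. The following is a minimal sketch of that calculation, assuming that gap columns count toward the alignment length; the function name and the two short aligned strings are hypothetical illustrations and are not taken from the sequence listing or from the BlastP implementation.

```python
def percent_identity(query_aln: str, subject_aln: str) -> float:
    """Percent identity over an alignment given as two equal-length
    strings, with '-' marking a gap. Gap columns are counted in the
    alignment length but never as matches (an assumption of this sketch)."""
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have the same length")
    if not query_aln:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for q, s in zip(query_aln, subject_aln) if q == s and q != "-"
    )
    return 100.0 * matches / len(query_aln)


# Hypothetical 7-column alignment: 5 identical columns, 1 gap, 1 mismatch.
example = percent_identity("MKT-LLA", "MKTALIA")
```

A value computed this way applies only to the aligned region (for example, "from amino acids 51-83"), not to the full length of either protein.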

Example 3 PFAM Results for Amino Acid Sequences

SEQ ID NO:2 contains a predicted Response_reg domain located from about amino acids 3 to 92 and a predicted Trans_reg_C domain located from about amino acids 84 to 225, and is a member of the Response regulator receiver domain family (Response_reg) (PFAM Accession PF00072) and a member of the Transcriptional regulatory protein C family (Trans_reg_C) (PFAM Accession PF00486).

SEQ ID NO:4 contains a predicted HAMP domain from about amino acids 184 to 253, a predicted HisKA domain located from about amino acids 376 to 443 and a predicted HATPase_c domain from about amino acids 496 to 607, and is a member of the HAMP domain family (HAMP) (PFAM Accession PF00672), a member of the His Kinase A (phosphoacceptor) domain family (HisKA) (PFAM Accession PF00512), and a member of the Histidine kinase-, DNA gyrase B-, and HSP90-like ATPase family (HATPase_c) (PFAM Accession PF02518).

SEQ ID NO:12 contains a predicted Response_reg domain from about amino acids 3 to 124 and a Trans_reg_C domain from about amino acids 160 to 131, and is a member of the Response regulator receiver domain family (Response_reg) (PFAM Accession PF00072) and a member of the Transcriptional regulatory protein C family (Trans_reg_C) (PFAM Accession PF00486).

SEQ ID NO:14 contains a predicted HAMP domain from about amino acids 173 to 242, a predicted HisKA domain located from about amino acids 253 to 319 and a predicted HATPase_c domain from about amino acids 364 to 475, and is a member of the HAMP domain family (HAMP) (PFAM Accession PF00672), a member of the His Kinase A (phosphoacceptor) domain family (HisKA) (PFAM Accession PF00512), and a member of the Histidine kinase-, DNA gyrase B-, and HSP90-like ATPase family (HATPase_c) (PFAM Accession PF02518).

SEQ ID NO:16 contains a predicted GGDEF domain from about amino acids 200 to 363, and is a member of the GGDEF domain family (GGDEF) (PFAM Accession PF00990).

SEQ ID NO:18 contains a predicted EAL domain from about amino acids 4 to 234, and is a member of the EAL domain family (EAL) (PFAM Accession PF00563).

SEQ ID NO:20 contains a predicted HisKA domain located from about amino acids 208 to 270 and a predicted HATPase_c domain from about amino acids 314 to 426, and is a member of the His Kinase A (phosphoacceptor) domain family (HisKA) (PFAM Accession PF00512) and a member of the Histidine kinase-, DNA gyrase B-, and HSP90-like ATPase family (HATPase_c) (PFAM Accession PF02518).

SEQ ID NO:22 contains a predicted Response_reg domain from about amino acids 1 to 120, and is a member of the Response regulator receiver domain family (Response_reg) (PFAM Accession PF00072).

SEQ ID NO:24 contains a predicted HAMP domain from about amino acids 203 to 274, a predicted HisKA domain from about amino acids 278 to 345 and a predicted HATPase_c domain from about amino acids 391 to 502, and is a member of the HAMP domain family (HAMP) (PFAM Accession PF00672), a member of the His Kinase A (phosphoacceptor) domain family (HisKA) (PFAM Accession PF00512) and a member of the Histidine kinase-, DNA gyrase B-, and HSP90-like ATPase family (HATPase_c) (PFAM Accession PF02518).

SEQ ID NO:26 contains a predicted Response_reg domain from about amino acids 2 to 120 and a predicted Trans_reg_C domain from about amino acids 156 to 227, and is a member of the Response regulator receiver domain family (Response_reg) (PFAM Accession PF00072) and a member of the Transcriptional regulatory protein C family (Trans_reg_C) (PFAM Accession PF00486).

SEQ ID NO:28 contains a predicted Response_reg domain from about amino acids 20 to 138 and a predicted Trans_reg_C domain from about amino acids 170 to 240, and is a member of the Response regulator receiver domain family (Response_reg) (PFAM Accession PF00072) and a member of the Transcriptional regulatory protein C family (Trans_reg_C) (PFAM Accession PF00486).

SEQ ID NO:30 contains a predicted HisKA domain from about amino acids 223 to 290 and a predicted HATPase_c domain from about amino acids 330 to 441, and is a member of the His Kinase A (phosphoacceptor) domain family (HisKA) (PFAM Accession PF00512) and a member of the Histidine kinase-, DNA gyrase B-, and HSP90-like ATPase family (HATPase_c) (PFAM Accession PF02518).

SEQ ID NO:36 contains a predicted HisKA domain from about amino acids 153 to 219 and a predicted HATPase_c domain from about amino acids 265 to 376, and is a member of the His Kinase A (phosphoacceptor) domain family (HisKA) (PFAM Accession PF00512) and a member of the Histidine kinase-, DNA gyrase B-, and HSP90-like ATPase family (HATPase_c) (PFAM Accession PF02518).

SEQ ID NO:38 contains a Response_reg domain from about amino acids 1 to 120 and a predicted Trans_reg_C domain from about amino acids 150 to 226, and is a member of the Response regulator receiver domain family (Response_reg) (PFAM Accession PF00072) and a member of the Transcriptional regulatory protein C family (Trans_reg_C) (PFAM Accession PF00486).

SEQ ID NO:40 contains a predicted DeoR domain from about amino acids 6 to 231, and is a member of the Bacterial regulatory proteins, DeoR family (DeoR) (PFAM Accession PF00455).

SEQ ID NO:44 contains a predicted Patatin domain from about amino acids 9 to 176, and is a member of the Patatin-like phospholipase family (Patatin) (PFAM Accession PF01734).

SEQ ID NO:50 contains a predicted Band_7 domain from about amino acids 21 to 194, and is a member of the SPFH domain/Band 7 family (Band_7) (PFAM Accession PF01145).

SEQ ID NO:58 contains a predicted MarR domain from about amino acids 35 to 138, and is a member of the MarR family (MarR) (PFAM Accession PF01047).

SEQ ID NO:60 contains an ABC_membrane domain from about amino acids 41 to 307 and an ABC_tran domain from about amino acids 377 to 582, and is a member of the ABC transporter transmembrane region family (ABC_membrane) (PFAM Accession PF00664) and a member of the ABC transporter family (ABC_tran) (PFAM Accession PF00005).

SEQ ID NO:72 contains a MatE domain from about amino acids 27 to 189, and is a member of the MatE domain family (MatE) (PFAM Accession PF01554).

SEQ ID NO:82 contains a Peptidase_C39 domain from about amino acids 10 to 145, an ABC_membrane domain from about amino acids 164 to 440 and an ABC_tran domain from about amino acids 512 to 696, and is a member of the Peptidase C39 family (Peptidase_C39) (PFAM Accession PF03412), a member of the ABC transporter transmembrane region family (ABC_membrane) (PFAM Accession PF00664) and a member of the ABC transporter family (ABC_tran) (PFAM Accession PF00005).

SEQ ID NO:124 contains a predicted YycH domain from about amino acids 12 to 429, and is a member of the YycH domain family (PFAM Accession No. PF07435).

SEQ ID NO:128 contains a predicted Lactamase_B domain from about amino acids 11 to 219, and is a member of the Lactamase_B domain family (PFAM Accession No. PF00753).

SEQ ID NO:130 contains a predicted PDZ domain from about amino acids 315 to 408 and a predicted trypsin domain from about amino acids 132 to 312, and is a member of the PDZ domain family (PFAM Accession No. PF00595) and a member of the trypsin domain family (PFAM Accession No. PF00089).

SEQ ID NO:56 contains a predicted hydrolase domain from about amino acids 6 to 243, and is a member of the hydrolase domain family (PFAM Accession No. PF00702).

SEQ ID NO:8 contains a domain with an E-value of 0.015 to a predicted HATPase_c domain from amino acids 321 to 425, and is a member of the HATPase_c domain family (PFAM Accession No. PF02518).

SEQ ID NO:10 contains a predicted Response_reg domain from about amino acids 3 to 140 and a predicted LytTR domain from about amino acids 160 to 254, and is a member of the Response_reg domain family (PFAM Accession No. PF00072) and a member of the LytTR domain family (PFAM Accession No. PF04397).

SEQ ID NO:138 contains a predicted amino acid permease domain from about amino acids 13 to 498, and is a member of the AA_permease domain family (PFAM Accession No. PF00324).

SEQ ID NO:144 contains a predicted aldo/keto reductase domain from about amino acids 10 to 228, and is a member of the Aldo/keto reductase family (PFAM Accession No. PF00248).

SEQ ID NO:148 contains a predicted major facilitator super family domain from about amino acids 15 to 356, and is a member of the major facilitator super family (MFS_1) domain family (PFAM Accession No. PF07609).

SEQ ID NO:150 contains a predicted region found in RelA/SpoT proteins from about amino acids 44 to 169, and is a member of the RelA_SpoT domain family (PFAM Accession No. PF04607).

SEQ ID NO:48 contains a predicted flavodoxin domain from about amino acids 67 to 224 (E-value equals 0.021), and is a member of the flavodoxin_1 domain family (PFAM Accession No. PF00258).

SEQ ID NO:52 contains a predicted RelB antitoxin domain from about amino acids 5 to 76 with an E-value of 0.0001, and is a member of the RelB domain family (PFAM Accession No. PF04221).

SEQ ID NO:64 contains a predicted peptidase family M1 domain from about amino acids 31 to 416, and is a member of the peptidase M1 domain family (PFAM Accession No. PF01433).

SEQ ID NO:70 contains a putative transposase DNA-binding domain from about amino acids 97 to 178, and is a member of the transposase_35 domain family (PFAM Accession No. PF07282).

SEQ ID NO:78 contains a predicted gram positive anchor domain from about amino acids 393 to 433, which is a member of the gram positive anchor domain family (PFAM Accession No. PF00746).

SEQ ID NO:32 contains a predicted LytTR DNA-binding domain from about amino acids 172 to 266 and a predicted response regulator receiver domain from about amino acids 12 to 152, and is a member of the LytTR DNA-binding domain family (PFAM Accession No. PF04397) and a member of the response regulator receiver domain family (PFAM Accession No. PF00072).

SEQ ID NO:34 contains a HATPase_c domain from about amino acids 320 to 434, and is a member of the histidine kinase-, DNA gyrase B-, and HSP90-like ATPase family (HATPase_c) (PFAM Accession No. PF02518).

SEQ ID NO:98 contains a predicted CAAX amino terminal protease family domain from about amino acids 38 to 148, with an E-value of 0.00025, which is a member of the ABI domain family (PFAM Accession No. PF02517).

SEQ ID NO:102 contains a predicted CAAX amino terminal protease family domain from about amino acids 243 to 353, with an E-value of 8.9e-06, and is a member of the ABI domain family (PFAM Accession No. PF02517).

SEQ ID NO:106 contains a predicted glycosyl hydrolases family domain from about amino acids 185 to 296, and is a member of the glycosyl hydrolases family (Glyco_hydro_31) (PFAM Accession No. PF01055).

SEQ ID NO:116 contains a predicted asp/glu/hydantoin racemase from about amino acids 2 to 231, and is a member of the asp/glu/hydantoin racemase domain family (PFAM Accession No. PF01055).

SEQ ID NO:118 contains a predicted mur ligase family, glutamate ligase domain from about amino acids 30 to 102, and is a member of the mur ligase family, glutamate ligase domain family (PFAM Accession No. PF02875).

SEQ ID NO:120 contains a predicted ABC transporter domain from about amino acids 408 to 592 and a predicted ABC transporter transmembrane region located from about amino acids 36 to 315, and is a member of the ABC transporter domain family (PFAM Accession No. PF01061) and a member of the ABC transporter transmembrane region domain family (PFAM Accession No. PF00664).

SEQ ID NO:122 contains a predicted ABC transporter domain from about amino acids 360 to 544 and a predicted ABC transporter transmembrane region located from about amino acids 16 to 287, and is a member of the ABC transporter domain family (PFAM Accession No. PF01061) and a member of the ABC transporter transmembrane region domain family (PFAM Accession No. PF00664).

SEQ ID NO:154 contains a predicted GTPase of unknown function from about amino acids 3 to 145, which is a member of the MMR_HSR1 domain family (PFAM Accession No. PF01926).

SEQ ID NO:158 contains a predicted ParB-like nuclease domain from about amino acids 37 to 126, and is a member of the ParB-like nuclease domain family (PFAM Accession No. PF02195).

SEQ ID NO:160 contains a predicted CobQ-CobB/MinD/ParA nucleotide binding domain from about amino acids 5 to 221, and is a member of the CbiA domain family (PFAM Accession No. PF01656).

SEQ ID NO:162 contains a predicted ParB-like nuclease domain from about amino acids 20 to 109, and is a member of the ParBc domain family (PFAM Accession No. PF02195).

SEQ ID NO:164 contains a predicted glucose inhibited division protein from about amino acids 21 to 215, and is a member of the GidB domain family (PFAM Accession No. PF02527).
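By way of illustration only, PFAM results of the kind listed above can be held as simple records and used to sort sequences by domain architecture. The sketch below uses the coordinates given above for SEQ ID NOs:2, 4, and 16; the record layout, the function name, and the classification rule (HisKA plus HATPase_c suggesting a histidine kinase, Response_reg suggesting a response regulator) are assumptions made for this example and are not a method described in this specification.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    name: str   # PFAM short name
    start: int  # approximate first amino acid of the domain
    end: int    # approximate last amino acid of the domain


# Domain coordinates as reported in this Example for three sequences.
PFAM_HITS = {
    2: [Domain("Response_reg", 3, 92), Domain("Trans_reg_C", 84, 225)],
    4: [Domain("HAMP", 184, 253), Domain("HisKA", 376, 443),
        Domain("HATPase_c", 496, 607)],
    16: [Domain("GGDEF", 200, 363)],
}


def classify(domains):
    """Assumed rule of thumb: label a protein by its PFAM domain content."""
    names = {d.name for d in domains}
    if {"HisKA", "HATPase_c"} <= names:
        return "histidine kinase"
    if "Response_reg" in names:
        return "response regulator"
    return "other"


labels = {seq_id: classify(hits) for seq_id, hits in PFAM_HITS.items()}
```

Under these assumptions SEQ ID NO:2 is labeled a response regulator and SEQ ID NO:4 a histidine kinase, while SEQ ID NO:16 falls into neither group.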

Example 4 Microarray Analysis of a Two-Component Regulatory System Involved in Acid Tolerance and Oligopeptide Transport Activity in Lactobacillus acidophilus

Survival of microorganisms during their transit through the gastrointestinal tract requires the capability to sense and respond to the various and changing conditions present in that environment. Two-component regulatory systems (2CRS) are among the most important mechanisms for environmental sensing and signal transduction. They are found in the majority of gram-positive and gram-negative bacteria and control housekeeping functions, as well as regulating proteins important for pathogenesis, stress and adherence (Cotter et al. (1999) J. Bacteriol. 181:6840-6843; Sebert et al. (2002) Infect. Immun. 70:4059-4067; Teng et al. (2002) Infect. Immun. 70:1991-1996). A typical 2CRS consists of a membrane-associated histidine protein kinase (HPK), which detects specific environmental signals, and a cytoplasmic response regulator (RR), which regulates expression of one or more genes in a regulon (Parkinson (1993) Cell 73:857-871). 2CRS are organized in modules with varying arrangements of conserved domains (West and Stock (2001) TRENDS Biochem. Sci. 26:369-376). HPKs generally consist of a signal input domain and an autokinase domain, which can be divided into two subdomains: a histidine phosphotransferase subdomain and an ATP-binding subdomain. The RR is typically composed of a regulatory (receiver) domain and a DNA binding (output) domain (Hoch and Varughese (2001) J. Bacteriol. 183:4941-4949). Detection of an external signal by the input domain of the kinase controls its own activation. The active kinase autophosphorylates on a histidine residue via ATP hydrolysis. This phosphoryl group is then transferred to an aspartate residue in the receiver domain of the RR, which activates the regulatory protein and promotes the transcriptional response (Foussard et al. (2001) Microbes Infect. 3:417-424).
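The signaling sequence described above (signal detection, autophosphorylation of the kinase on a histidine residue, and transfer of the phosphoryl group to an aspartate residue of the regulator) can be summarized as a small state model. The following is a deliberately simplified sketch; the class names, attributes, and the all-or-nothing treatment of phosphorylation are assumptions made for illustration and do not represent kinetics or any particular 2CRS of the invention.

```python
from dataclasses import dataclass


@dataclass
class HistidineKinase:
    his_phosphorylated: bool = False

    def sense(self, signal_present: bool) -> None:
        # On detecting its signal, the kinase autophosphorylates on a
        # histidine residue (ATP hydrolysis is implied, not modeled).
        if signal_present:
            self.his_phosphorylated = True


@dataclass
class ResponseRegulator:
    asp_phosphorylated: bool = False

    @property
    def active(self) -> bool:
        # In this sketch the regulator is active only when phosphorylated.
        return self.asp_phosphorylated


def phosphotransfer(hpk: HistidineKinase, rr: ResponseRegulator) -> None:
    """Move the phosphoryl group from the kinase His to the regulator Asp."""
    if hpk.his_phosphorylated and not rr.asp_phosphorylated:
        hpk.his_phosphorylated = False
        rr.asp_phosphorylated = True


def respond(signal_present: bool) -> bool:
    """Return True if the regulator ends up active for the given input."""
    hpk, rr = HistidineKinase(), ResponseRegulator()
    hpk.sense(signal_present)
    phosphotransfer(hpk, rr)
    return rr.active
```

In this model the regulator becomes active only when the kinase has first sensed its signal, which mirrors the dependence of the transcriptional response on the HPK that is tested in this Example by disrupting the HPK gene.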

Genomic sequencing of microorganisms has uncovered the presence of many 2CRS and promoted global analysis of their responses to different environments. For those studies, DNA microarray technology involving high-density arrays of open reading frame-specific fragments has been instrumental. Fabret et al. (Fabret et al. (1999) J. Bacteriol. 181:1975-1983) identified 2CRS in Bacillus subtilis and classified them into five groups, and the functions of these 2CRS have been investigated by microarray analysis (Kobayashi et al. (2001) J. Bacteriol. 183:7365-7370; Ogura et al. (2001) Nucleic Acids Res. 29:3804-3813).

In lactic acid bacteria (LAB), production of some class II bacteriocins (plantaricin, sakacin P, sakacin A, carnobacteriocin B2) is transcriptionally regulated through a signal transduction pathway which consists of three components: an inducer bacteriocin-like peptide, a HPK, and a RR (for a review see 25). In fact, the production of many small antimicrobial peptides appears to be modulated by a cell-density response mechanism. Additionally, multiple 2CRS have been identified in a number of LAB (Miller and Bassler (2001) Annu. Rev. Microbiol. 55:165-199; Morel-Deville et al. (1997) Microbiology 143:1513-1520). For example, six 2CRS were detected in Lactococcus lactis, with four of them implicated in cellular responses to stress (O'Connell-Motherway et al. (2000) Microbiology 146:935-947).

Lactobacillus acidophilus NCFM is a probiotic organism that has been used extensively in yogurt, fermented foods, and dietary supplements (Sanders and Klaenhammer (2001) J. Dairy Sci. 84:319-331). The annotated genome sequence of L. acidophilus NCFM encodes nine putative 2CRS (Altermann et al. (2004) Proc. Natl. Acad. Sci. U.S.A. 102:3906-3912). In this study, we identified a 2CRS similar to the lisRK system described in Listeria monocytogenes (Cotter et al. (1999) J. Bacteriol. 181:6840-6843), which participates in both stress response and virulence in L. monocytogenes. The HPK gene from the LBA1524HPK-LBA1525RR system was disrupted to investigate its putative role in acid tolerance. A whole genome array containing 97.4% of the annotated L. acidophilus genes was constructed and used to compare genome-wide transcriptional patterns of the control and the HPK mutant, each exposed to three different pHs.

Materials and Methods

Bacterial Strains and Growth Conditions

The bacterial strains used in this study were Escherichia coli EC1000 (RepA⁺ MC1000, Km^(R); host for pORI28-based plasmids, [Law et al. (1995) J. Bacteriol. 177:7011-7018]), and L. acidophilus strains: NCFM (human intestinal isolate; [Barefoot and Klaenhammer (1983) Appl. Environ. Microbiol. 45:1808-1815]), NCK1398 (NCFM lacL::pTRK685, [Russell and Klaenhammer (2001) Appl. Environ. Microbiol. 67:4361-4364]) and NCK1686 (NCFM LBA1524::pTRK807, [this example]).

E. coli strains were propagated at 37° C. in Luria-Bertani (LB, Difco Laboratories Inc., Detroit, Mich.) broth with shaking. Erythromycin (Em) resistant clones of E. coli were selected on brain heart infusion (BHI) agar (Difco) supplemented with Em (150 μg/ml). Lactobacilli were propagated statically at 37° C. in MRS (Difco) or on MRS supplemented with 1.5% agar. When appropriate, Em (5.0 μg/ml) and/or chloramphenicol (Cm, 7.0 μg/ml) was added. Reconstituted skim milk (10% SM) and 10% SM supplemented with 1% yeast extract (Difco) or 0.25% casamino acids (Difco) were used for determination of acidification rates.

Standard DNA Techniques

Restriction enzymes (Roche Molecular Biochemicals, Indianapolis, Ind.) and T4 DNA ligase (New England Biolabs, Beverly, Mass.) were used according to the suppliers' recommendations. Plasmid preparations from E. coli were performed using the QIAprep Spin Plasmid Minipreps kit (QIAGEN Inc., Valencia, Calif.). Chromosomal DNA from L. acidophilus was extracted according to Walker and Klaenhammer (Walker and Klaenhammer (1994) J. Bacteriol. 176:5330-5340). Electrotransformation of L. acidophilus was carried out as described by Walker et al. (Walker et al. (1996) FEMS Microbiol. Lett. 138:233-237). PCR was performed by standard protocols using Taq DNA polymerase (Roche Molecular Biochemicals).

DNA Sequence Analysis and Data Submission

Potential coding sequences were derived from the genomic sequence of L. acidophilus NCFM (Genbank accession number CP000033, [Altermann et al. (2004) Proc. Natl. Acad. Sci. U.S.A. 102:3906-3912]). Protein sequence similarity analysis was conducted using the BlastP module (Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402) at NCBI (ncbi.nlm.nih.gov/). TMHMM (cbs.dtu.dk/services/TMHMM) was used to predict transmembrane helices in proteins. CD-Search (Marchler-Bauer et al. (2003) Nucleic Acids Res. 31:383-387) was employed to identify conserved domains in protein sequences.

Microarray platform and data are available at the Gene Expression Omnibus (GEO [http://www.ncbi.nlm.nih.gov/geo]) under accession numbers GPL1401 (platform) and GSE1976 (series).

RNA Isolation and RNA Slot Blots

Aliquots (10 ml) of L. acidophilus cultures grown on MRS to A₆₀₀=0.3 were transferred to MRS (adjusted to desired pH with lactic acid). After 30 minutes, cells were harvested by centrifugation and frozen immediately in a dry ice/ethanol bath. One ml Trizol (Life Technologies, Rockville, Md.) was added to the cell pellets and they were homogenized in a Mini-Beadbeater-8 cell disruptor (Biospec Products, Bartlesville, Okla.) for five 1-min cycles (chilled on ice for 1 min between cycles). The phases were separated by centrifugation (14,000 rpm, 15 min, 4° C.). The aqueous phase was removed to a fresh tube and 0.4 ml of Trizol and 0.2 ml of chloroform were added. The mixture was vortexed for 15 s and centrifuged to separate the phases. The Trizol step was repeated twice and RNA was precipitated from the final aqueous phase by adding 1 volume of isopropanol, followed by incubation at room temperature for 10 min and centrifugation (12,000 rpm, 10 min, 4° C.). Concentration and purity of RNA samples were determined by electrophoresis on agarose gels and standard spectrophotometer measurements.

Total RNA hybridizations using a slot-blot apparatus (Bio-Dot SF, Bio-Rad) and Zeta-Probe membrane (Bio-Rad Laboratories, Inc.) were carried out as previously described (Durmaz et al. (2002) J. Bacteriol. 184:6532-6543). [α-³²P]dCTP-labeled probes were generated from PCR fragments using the Multiprime DNA labeling system (Amersham Pharmacia Biotech Inc., Piscataway, N.J.) and purified using the NucTrap Probe purification columns (Stratagene, La Jolla, Calif.). The primers utilized are listed in Table 2. Radioactive signals were detected using Kodak Biomax film, and autoradiographs were analyzed by densitometry using the SpotDenso function with auto-linked background on an AlphaImager 2000 (Innotech Scientific). Primers as set forth in Table 2 are denoted in the sequence listing as follows: for LBA0197 (SEQ ID NO:167), for LBA1300 (SEQ ID NO:168), for LBA1524 (SEQ ID NO:169), for LBA1525 (SEQ ID NO:170), for LBA0698 (SEQ ID NO:171), for LBA1075 (SEQ ID NO:175), for LBA1196 (SEQ ID NO:176).
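
By way of illustration only, the restriction enzyme sites carried in the 5′ tails of several Table 2 primers can be located programmatically. The Python sketch below assumes that those tails contain XbaI (tctaga) and BglII (agatct) recognition sequences. The enzymes are not named in this example, so that assignment, together with the names SITES and find_sites, is an assumption made for the sketch.

```python
# Recognition sequences assumed for this sketch; the text does not name the enzymes.
SITES = {"XbaI": "tctaga", "BglII": "agatct"}

# Primer sequences copied from Table 2 (dashes mark the introduced tails).
PRIMERS = {
    "LBA1524 F": "gatctctaga-cagcgctctagca",
    "LBA1524 R": "gatcagatct-tcggccaatgtg",
    "LBA0698 F": "tcgtagttgacggtaagaag",
}


def find_sites(primer):
    """Return {enzyme: 0-based start index} for each assumed site found in the primer."""
    seq = primer.replace("-", "").lower()  # drop the dash used as a visual marker
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


for label, primer in PRIMERS.items():
    print(label, find_sites(primer))
```

A primer without a dash, such as the LBA0698 forward primer, is expected to return an empty result, which matches the Table 2 footnote that dashes mark the introduced sites.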

TABLE 2 Primers utilized for probe generation in Northern blot analysis.

ORF | Description | Primers*
LBA0197 | ABC transporter, oligopeptide binding protein OppA1 | F: 5′ gcagcatgtagtagtaataa 3′ | R: 5′ cagaatcacgtaatgtgtaa 3′
LBA1300 | Oligopeptide ABC transporter, substrate binding protein OppA2 | F: 5′ atgcaatagattgacgaaga 3′ | R: 5′ atgcaatatggtgctgaatc 3′
LBA1524 | Two-component sensor histidine kinase | F: 5′ gatctctaga-cagcgctctagca 3′ | R: 5′ gatcagatct-tcggccaatgtg 3′
LBA1525 | Two component system regulator | F: 5′ gatctctaga-cacgaaccgtctt 3′ | R: 5′ gatcagatct-ttggctcgatttg 3′
LBA0698 | Glyceraldehyde-3-P dehydrogenase | F: 5′ tcgtagttgacggtaagaag 3′ | R: 5′ acctgcagtagttaccatag 3′
LBA1075 | Malolactic enzyme | F: 5′ gttgttacagacggtgaagg 3′ | R: 5′ taatgcacgaccatcagtcc 3′
LBA1196 | RNA polymerase sigma factor RpoD | F: 5′ gatctctaga-ttccgcttcttact 3′ | R: 5′ gatcagatct-atctgacgaatacg 3′

*Dashes indicate the introduction of restriction enzyme sites.

Generation of Lactobacillus acidophilus DNA Microarray

A whole genome DNA microarray based on the PCR products of predicted ORFs from the L. acidophilus genome was used for global gene expression analysis. PCR primers for 1,966 genes were designed using GAMOLA software (Altermann and Klaenhammer (2003) OMICS 7:161-169) and purchased from Qiagen Operon (Alameda, Calif.). Total genomic DNA from L. acidophilus NCFM was used as a template for 96-well PCR amplifications. To amplify gene-specific PCR products, a 100 μl reaction mix contained: 1 μl L. acidophilus DNA (100 ng/μl), 10 μl specific primer pairs (10 μM), 0.5 μl of dNTP mix (10 mM), 10 μl PCR buffer (10×), and 1 μl Taq DNA polymerase (5 U/μl [Roche Molecular Biochemicals]). The following PCR protocol was used: an initial denaturation step for 5 min at 94° C. followed by 40 cycles of denaturation at 94° C. for 15 sec, annealing at 50° C. for 30 sec and polymerization at 72° C. for 45 sec. Approximately 95% of ORFs produced a unique PCR product between 100 and 800 bp. The size of fragments was confirmed by electrophoresis in 1% agarose gels. DNA from 96-well plates was purified using the Qiagen Purification Kit. In general, the total quantity of each PCR product was greater than 1 μg. The purified PCR fragments were spotted three times in a random pattern on glass slides (Corning, Acton, Mass.) using the Affymetrix® 417™ Arrayer at the NCSU Genome Research Laboratory (cals.ncsu.edu:8050/grl/). To prevent carry-over contamination, pins were washed between uses in different wells. Humidity was controlled at 50-55% during printing. DNA was cross-linked to the surface of the slide by UV (300 mJ) and subsequent incubation of the slides for 2 h at 80° C. The reliability of the microarray data was assessed by hybridization of two cDNA samples prepared from the same total RNA, labeled with Cy3 and Cy5. Hybridization data revealed a linear correlation in the relative expression level of 98.6% of 5685 spots (each gene in triplicate) with no more than a two-fold change.

cDNA Probe Preparation and Microarray Hybridization

Identical amounts (25 μg) of DNase-treated (Invitrogen) RNA were aminoallyl-labeled by reverse transcription with random hexamers in the presence of amino-allyl dUTP (Sigma Chemical Co.), using Superscript II reverse transcriptase (Life Technologies) at 42° C. overnight, followed by fluorescence-labeling of amino-allylated cDNA with N-hydroxysuccinimide-activated Cy3 or Cy5 esters (Amersham Pharmacia Biotech). Labeled cDNA probes were purified using the PCR Purification Kit (Qiagen). Coupling of the Cy3 and Cy5 dyes to the AA-dUTP labeled cDNA and hybridization of samples to microarrays were performed according to the protocols outlined in the TIGR protocols website (tigr.org/tdb/microarray/protocolsTIGR.shtml). Briefly, combined Cy5- and Cy3-labeled cDNA probes were hybridized to the arrays for 16 h at 42° C. After hybridization, the slides were washed twice in low stringency buffer (1×SSC containing 0.2% SDS) for 5 min each. The first wash was performed at 42° C. and the second one at room temperature. Subsequently, the slides were washed in a high stringency buffer (0.1×SSC containing 0.2% SDS, for 5 min at room temperature) and finally in 0.1×SSC (2 washes of 2.5 min each at room temperature).

Data Normalization and Gene Expression Analysis

Immediately after washing of the arrays, fluorescence intensities were acquired at 10 μm resolution using a ScanArray 4000 Microarray Scanner (Packard Biochip BioScience, Biochip Technologies LLC, Mass.) and stored as TIFF images. Signal intensities were quantified, the background was subtracted and data was normalized using the QuantArray 3.0 software package (Perkin Elmer). Two slides (each containing triplicate arrays) were hybridized reciprocally to Cy3- and Cy5-labeled probes per experiment (dye swap). Spots were analyzed by adaptive quantitation. Data was median normalized. When the local background intensity was higher than the spot signal (negative values), no data was considered for those spots. The median of the six ratios per gene was recorded. The ratio between the average absolute pixel values for the replicated spots of each gene with and without treatment represented the fold change in gene expression. All genes belonging to a potential operon were considered for analysis if at least one gene of the operon showed significant expression changes, and the remaining genes showed trends toward that expression. Confidence intervals and P values on the fold change were also calculated with the use of a two-sample t test. P values of 0.05 or less were considered significant (Knudsen (2002) “A Biologist's Guide to Analysis of DNA Microarray Data,” (John Wiley & Sons, Inc., New York)).
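
The processing steps described above (background subtraction, exclusion of spots whose background exceeds the signal, median normalization, and the median of the six ratios per gene) can be sketched in Python as follows. This is a minimal illustration and not the QuantArray implementation. The function names are hypothetical, and treating a net signal of exactly zero as missing is an assumption made so that ratios stay defined.

```python
from statistics import median


def background_subtract(signal, background):
    """Return signal minus local background, or None when the result is not positive."""
    net = signal - background
    return net if net > 0 else None  # assumption: zero is treated like a negative value


def median_normalize(values):
    """Scale one channel so that the median of its usable spots equals 1.0."""
    usable = [v for v in values if v is not None]
    m = median(usable)
    return [None if v is None else v / m for v in values]


def gene_ratio(treated, control):
    """Median of per-spot treated/control ratios, skipping spots with missing data."""
    ratios = [t / c for t, c in zip(treated, control)
              if t is not None and c is not None]
    return median(ratios) if ratios else None
```

With three spots per array and two dye-swapped slides, each gene contributes up to six ratios to gene_ratio; spots discarded during background subtraction simply reduce that number.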

Construction of the Histidine Protein Kinase Mutant

A 766-bp internal fragment of ORF LBA1524 was amplified using L. acidophilus NCFM chromosomal DNA as template and the primers I1524F (5′-gatctagacagcgctctagca-3′) and I1524R (5′-gatcgatcttcggccaatgtg-3′). The internal fragment was cloned in the integrative vector pORI28 (Law et al. (1995) J. Bacteriol. 177:7011-7018), generating pTRK807, and introduced by electroporation into L. acidophilus NCFM containing pTRK669 (Russell and Klaenhammer (2001) Appl. Environ. Microbiol. 67:4361-4364).

Subsequent steps to facilitate the integration event were carried out according to Russell and Klaenhammer (Russell and Klaenhammer (2001) Appl. Environ. Microbiol. 67:4361-4364). The suspected integrants were confirmed by PCR and Southern hybridization analysis, using standard procedures.

Acid Challenge and Adaptation Assays

For acid challenge analysis, cells were grown to an absorbance at 600 nm (A₆₀₀) of 0.25-0.3 (pH>5.8) from a 2% inoculum in MRS broth. Cultures were centrifuged and resuspended in the same volume of MRS adjusted to pH 3.5 with lactic acid at 37° C. Survival was determined at 30-minute intervals by plating serial dilutions in a 10% MRS broth diluent onto MRS agar using a Whitley Automatic Spiral Plater (Don Whitley Scientific Limited, West Yorkshire, England).

For acid adaptation assays, cells were grown to an A₆₀₀ of 0.25-0.3 (pH>5.8). Cells were centrifuged and resuspended in the same volume of MRS pH 5.5 (adjusted with lactate) and incubation continued for 1 hour at 37° C. as described previously (Azcarate-Peril et al. (2004) Appl. Environ. Microbiol. 70:5315-5322). Controls were resuspended in MRS broth at pH 6.8. The cells from the adapted (pH 5.5) and control (pH 6.8) cultures were then centrifuged and resuspended in MRS broth at pH 3.5 (adjusted with lactic acid). Viable-cell counts were performed at 30-minute intervals for 2.5 h by plating on MRS agar.
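
Survival in these assays is conveniently expressed as a log reduction in viable counts relative to the count at the start of the challenge. The Python sketch below shows the arithmetic. The function names and the CFU/ml values in the usage lines are hypothetical and are not data from this example.

```python
import math


def log_reduction(cfu_initial, cfu_final):
    """Log10 reduction in viable counts between two time points (CFU/ml)."""
    if cfu_initial <= 0 or cfu_final <= 0:
        raise ValueError("CFU/ml values must be positive")
    return math.log10(cfu_initial / cfu_final)


def percent_survival(cfu_initial, cfu_final):
    """Survivors as a percentage of the initial viable count."""
    return 100.0 * cfu_final / cfu_initial


# Hypothetical counts: 1e8 CFU/ml at time zero, 1e6 CFU/ml after the challenge.
print(log_reduction(1e8, 1e6), percent_survival(1e8, 1e6))
```

On this scale a 2-log reduction corresponds to 1% survival, and a half-log reduction to roughly 32% survival.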

Ethanol Tolerance

Log phase cells at an A₆₀₀ of 0.25-0.3 (pH>5.8) from a 2% inoculum in MRS broth were centrifuged and resuspended in the same volume of MRS, or MRS containing 15 or 20% (v/v) ethanol. CFU/ml were determined at 30-minute intervals by serial dilutions in 10% MRS and enumeration on MRS agar as described above.

Results

Two-Component Regulatory Systems (2CRS)

Using CD-search (Marchler-Bauer et al. (2003) Nucleic Acids Res. 31:383-387) and BlastP (Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402) programs, we identified nine signal transduction systems consisting of a histidine protein kinase (HPK) and a response regulator (RR; [Altermann et al. (2004) Proc. Natl. Acad. Sci. U.S.A. 102:3906-3912]). These 2CRS represented almost 1% of L. acidophilus NCFM ORFs. Additionally, four RRs were identified containing a LytTR DNA binding motif that were not associated with a histidine kinase. HPKs share a characteristic kinase core composed of a dimerization domain and a catalytic domain for ATP binding and phosphate transfer. The C-terminal half of the HPK proteins showed five conserved amino acid motifs: the H box, containing the His residue that will be phosphorylated, and the N, G1, F and G2 boxes (Stock et al. (2000) Annu. Rev. Biochem. 69:183-215). ORFs LBA0079HPK, LBA0747HPK, LBA1524HPK, LBA1430HPK, LBA1660HPK and LBA1819HPK were assigned to the group IIIA/OmpR of HPKs in accordance with the region surrounding the histidine that becomes phosphorylated, whereas the HPKs LBA0602HPK and LBA1799HPK were categorized in Class IV (Fabret et al. (1999) J. Bacteriol. 181:1975-1983). The remaining 2CRS (LBA1413-LBA1414) could not be classified into any known category. LBA1413 showed a Domain of Unknown Function with GGDEF motif (smart00267, DUF1), which apparently occurs exclusively in eubacteria and might participate in prokaryotic signaling processes. LBA1414 also showed a domain of unknown function (cd01948, EAL), which is found in diverse bacterial signaling proteins. Together with the GGDEF domain, EAL might be involved in regulating cell surface adhesiveness in bacteria (Galperin et al. (2001) FEMS Microbiol. Lett. 203:11-21).

Response regulators contain two conserved domains. The first is a regulator domain, which receives the signal from the sensor partner in bacterial 2CRS. It contains a phosphoacceptor site that is phosphorylated by the histidine kinase. The second is a DNA binding effector domain in the C terminus of the protein. RRs present in L. acidophilus contained these two conserved domains. The RRs ranged from 221 to 274 amino acids in size. ORFs LBA0078RR, LBA0746RR, LBA1525RR, LBA1431RR, LBA1659RR and LBA1820RR can be included in the OmpR family of response regulators according to the amino acid sequence of their output domains, where the residues involved in the hydrophobic core of the domain are conserved (Martinez-Hackert and Stock (1997) Structure 5:109-124). The response regulators encoded by LBA0603RR and LBA1798RR can be defined as members of the AlgR/AgrA/LytR family of RRs (Nikolskaya and Galperin (2002) Nucleic Acids Res. 30:2453-2459).

The 2CRS composed of LBA1524HPK and LBA1525RR formed an operon flanked by two terminators with a free energy of −11.0 and −13.8 kcal/mol, respectively. Also, a typical RBS sequence and a putative promoter were positioned upstream of LBA1525RR (FIG. 1). The histidine protein kinase gene showed 36% identity with the HPK in the lisRK system described in Listeria monocytogenes (Cotter et al. (1999) J. Bacteriol. 181:6840-6843). This two-component signal transduction system was shown to participate in the stress response and virulence of L. monocytogenes. A lisRK-defective mutant, generated by random insertional mutagenesis, grew at higher concentrations of ethanol than the parental strain, but was more sensitive to acid stress during logarithmic phase of growth (Cotter et al. (1999) J. Bacteriol. 181:6840-6843). LBA1524HPK also showed homology (32% identity and 55% similarity) to the HPK gene of csrRS, a system that represses the expression of the hyaluronic acid capsule and virulence factors of Streptococcus, SLS and SpeB (Heath et al. (1999) Infect. Immun. 67:5298-5305).

Insertional Inactivation of LBA1524HPK and Acid Stress Assays

To investigate the physiological function of the LBA1524HPK-LBA1525RR 2CRS and to examine its putative association with acid tolerance in L. acidophilus, a chromosomally interrupted LBA1524HPK mutant was constructed. For insertional inactivation of the HPK, a 766-bp internal region was amplified by PCR using the primers I1524F-I1524R described in Materials and Methods. This fragment was cloned into pORI28 and the resulting plasmid, pTRK807, was then transferred by electroporation into L. acidophilus NCFM, already harboring the helper plasmid pTRK669. Integrants were isolated as described by Russell and Klaenhammer (Russell and Klaenhammer (2001) Appl. Environ. Microbiol. 67:4361-4364) to generate strain NCK1686. PCR experiments and Southern hybridizations were performed to confirm the integration event via junction amplicons and fragments (data not shown). Because this operon was flanked by two putative terminators, polar effects from the inactivation of LBA1524HPK were not expected. Phase-contrast microscopy analyses of the HPK mutant revealed a decrease in cell size and chain length compared to the wild-type NCFM cells (data not shown).

Two strong transmembrane regions can be predicted, by in silico analysis, in the histidine protein kinase of the LBA1524HPK-LBA1525RR 2CRS (from 24 to 42 aa, and 202 to 226 aa). The ATP-binding phosphotransfer (catalytic) domain and the dimerization domain can be located in the carboxy terminus of the protein from 396 to 499 aa and from 276 to 341 aa, respectively. The 766-bp internal region of LBA1524HPK, amplified by PCR using primers I1524F-I1524R and used to inactivate the HPK, spanned from 51 to 347 aa. As a consequence, insertion of the vector would have affected the second transmembrane and/or the dimerization domain of the HPK.

The response of log phase cells to pH 3.5 was compared between the HPK mutant strain NCK1686 and the control, L. acidophilus NCK1398 (NCFM::lacL). Strain NCK1398 was used as a control throughout the study so that the effects of antibiotic pressure could be accounted for. When log phase cells of NCK1686 were exposed to pH 3.5, more than a 2-log reduction in cfu was observed after 2.5 hours, compared to a half-log reduction in the control (FIG. 3A). Therefore, similar to L. monocytogenes (Cotter et al. (1999) J. Bacteriol. 181:6840-6843), the HPK mutant was more sensitive to acid, indicating that the LBA1524HPK-LBA1525RR 2CRS plays a significant role in acid resistance of L. acidophilus.

Acid Adaptation of L. acidophilus

Log phase cells of L. acidophilus NCK1398 and NCK1686 were exposed to pH 5.5 for 1 h, prior to challenge by pH 3.5. Remarkably, both the control and HPK mutant exhibited a high tolerance to acid challenge (FIG. 2B). Exposure to pH 5.5 appeared to adapt the cells to a higher level of acid tolerance during challenge at pH 3.5. Acid sensitivity incurred by the LBA1524HPK mutant over 150 min was nearly abolished by the adaptation period at pH 5.5, but after 2.5 h at pH 3.5 the mutant still remained more sensitive than the control.

Global Gene Expression of the HPK Mutant

In an attempt to identify genes regulated by the 2CRS, and potentially affected by inactivation of the LBA1524HPK ORF, parallel cultures of the control strain NCK1398 (NCFM::lacL) and the HPK mutant (NCK1686) were grown in MRS broth to an optical density of 0.3 and transferred to MRS adjusted to pH 6.8, 5.5, or 4.5. After 30 minutes, RNA was isolated and used for hybridization to microarray slides printed with representative sequences of the majority of the identified ORFs on the L. acidophilus genome. Statistically significant (P≤0.05) gene expression changes were considered for ORFs exhibiting at least a two-fold change.
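
The selection criterion stated above can be written as a short filter. In the Python sketch below, a two-fold change is taken to mean a mutant/control ratio of at least 2.0 or at most 0.5, which is an assumption consistent with the ratios reported in Table 3. The ratios are the pH 6.8 values from Table 3, whereas the P values are hypothetical placeholders, since individual P values are not reported in this example.

```python
def is_differentially_expressed(ratio, p_value, min_fold=2.0, alpha=0.05):
    """True when the ratio changes at least min_fold in either direction and P <= alpha."""
    changed = ratio >= min_fold or ratio <= 1.0 / min_fold
    return changed and p_value <= alpha


# (ratio at pH 6.8 from Table 3, hypothetical P value)
examples = {
    "LBA0197": (6.22, 0.01),
    "LBA0111": (0.36, 0.02),
    "LBA0911": (1.79, 0.01),
    "LBA0943": (2.94, 0.20),
}

selected = [gene for gene, (ratio, p) in examples.items()
            if is_differentially_expressed(ratio, p)]
print(selected)
```

Under these placeholder P values, LBA0911 is excluded by the fold-change threshold and LBA0943 by the significance threshold, which shows that both conditions must hold at once.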

Comparison of the expression profiles identified 80 differentially expressed genes showing at least two-fold changes in expression patterns (Table 3). As expected, the components of the LBA1524HPK-LBA1525RR 2CRS, as well as the large and small subunits of the β-galactosidase and UDP-glucose 4-epimerase, were differentially expressed, owing to the inactivation of these genes in the compared strains. Surprisingly, the inactivated HPK gene and the RR were overexpressed in the NCK1686 mutant. This might be attributable to amplification of the vector in the chromosome and/or a readthrough event where a longer transcript is generated, but not translated into a functional protein. The same effect was observed for NCFM::lacL, where the disrupted operon appeared to be highly expressed. Alternatively, a non-functional HPK could result in elevated transcriptional expression of the 2CRS, if the phosphorylated form of the RR was involved in the autoregulation of the 2CRS.

TABLE 3 Open reading frames differently expressed in the HPK mutant (NCK1686) compared to the control L. acidophilus NCK1398 (NCFM::lacL) under different pH conditions¹.

COG² Functional Classification/Gene | Description | Relative mRNA ratio (HPK/WT)³: pH 6.8 | pH 5.5 | pH 4.5

Amino acid transport and metabolism [E]
LBA0111 | Putative ABC transporter (glutamine), ATP binding protein | 0.36 | 0.47 | 0.57
LBA0112 | Putative ABC transporter (glutamine), substrate binding protein | 0.53 | 0.71 | 0.83
LBA0197 | ABC transporter, oligopeptide binding protein oppA1 | 6.22 | 4.43 | 1.92
LBA0198 | ABC transporter, oligopeptide binding protein oppA1B | 7.42 | 1.96 | 1.59
LBA0200 | ABC transporter, oligopeptide permease protein oppB1 | 6.70 | 6.31 | 1.67
LBA0201 | ABC transporter, oligopeptide permease protein oppC1 | 7.27 | 8.09 | 2.55
LBA0202 | Oligopeptide ABC transporter, ATP binding protein oppD1 | 7.44 | 7.09 | 3.89
LBA0203 | Oligopeptide ABC transporter, ATP binding protein oppF1 | 8.01 | 7.10 | 5.57
LBA0849 | Diaminopimelate epimerase | 1.30 | 2.15 | 0.91
LBA0850 | Aspartokinase/homoserine dehydrogenase | 1.33 | 1.89 | 0.79
LBA0911 | Aminopeptidase pepC | 1.79 | 1.79 | 0.72
LBA0943 | Cationic amino acid transporter | 2.94 | 2.90 | 1.51
LBA1042 | ABC transporter (glutamine) membrane-spanning permease | 0.93 | 0.44 | 0.71
LBA1044 | ABC transporter (glutamine) membrane-spanning permease | 0.91 | 0.47 | 0.72
LBA1045 | ABC transporter (glutamine) ATP-binding protein | 0.78 | 0.39 | 0.82
LBA1046 | ABC transporter (glutamine) substrate-binding protein | 0.84 | 0.43 | 0.71
LBA1080 | Putative methionine synthase metK | 6.96 | 5.51 | 4.32
LBA1086 | Amino acid permease | 3.44 | 1.73 | 1.64
LBA1135 | Macrolide efflux protein | 1.12 | 2.00 | 1.21
LBA1211 | Homoserine kinase khsE | 1.84 | 1.67 | 1.32
LBA1212 | Homoserine dehydrogenase hdh | 2.25 | 1.50 | 1.20
LBA1300 | Oligopeptide ABC transporter, substrate binding protein oppA2 | 0.35 | 0.47 | 0.36
LBA1301 | Oligopeptide ABC transporter, substrate binding protein oppA2B | 4.92 | 1.74 | 1.37
LBA1302 | Oligopeptide ABC transporter, permease protein oppC2 | 1.29 | 2.14 | 1.43
LBA1303 | ABC transporter, oligopeptide permease protein oppB2 | 1.50 | 1.98 | 1.49
LBA1305 | Oligopeptide ABC transporter, ATP binding protein oppF2 | 1.50 | 2.00 | 1.33
LBA1306 | Oligopeptide ABC transporter, ATP binding protein oppD2 | 1.24 | 2.22 | 1.35
LBA1341 | Branched-chain amino acid aminotransferase ILVE | 2.13 | 1.08 | 1.26
LBA1515 | Peptidase T pepT | 2.26 | 2.05 | 1.32
LBA1665 | Oligopeptide ABC transporter, substrate binding protein | 0.38 | 0.15 | 0.58
LBA1837 | Cytosol non-specific dipeptidase pepD/A | 1.03 | 1.55 | 3.10
LBA1961 | Oligopeptide ABC transporter, substrate binding protein | 2.05 | 1.91 | 1.08

Carbohydrate transport and metabolism [G]
LBA0600 | Xylulose-5-phosphate/fructose phosphoketolase | 1.31 | 3.22 | 0.78
LBA1467 | Beta-galactosidase large subunit (lactase) | 0.07 | 0.17 | 0.27
LBA1468 | Beta-galactosidase small subunit | 0.17 | 0.43 | 1.05
LBA1777 | PTS system, fructose-specific enzyme II | 0.98 | 1.26 | 0.41
LBA1778 | Fructose-1-phosphate kinase | 1.00 | 1.30 | 0.29
LBA1779 | Transcriptional repressor (fructose operon) | 0.92 | 1.32 | 0.35
LBA1870 | Maltose phosphorylase | 0.67 | 0.90 | 0.21
LBA1872 | Oligo-1,6-glucosidase | 0.97 | 0.88 | 0.46

Inorganic ion transport and metabolism [P]
LBA0319 | ABC transporter, ATP binding protein | 1.19 | 1.06 | 1.88
LBA0320 | ABC transporter, ATP binding protein | 1.25 | 0.87 | 1.96
LBA0321 | ABC transporter, permease protein | 1.34 | 1.33 | 2.21
LBA0904 | Outer membrane lipoprotein precursor | 2.11 | 2.11 | 1.19
LBA0905 | ABC transporter, ATP binding protein | 2.08 | 2.07 | 1.40
LBA0906 | ABC transporter, permease protein | 1.99 | 2.88 | 2.19
LBA1683 | Cation-transporting ATPase | 7.95 | 1.92 | 1.83

Signal transduction mechanisms [T]
LBA0149 | Hypothetical protein | 1.28 | 1.01 | 0.56
LBA0403 | Hypothetical protein | 1.01 | 1.30 | 1.21
LBA1081 | Autoinducer-2 production protein luxS | 1.69 | 2.27 | 1.51
LBA1524 | Two-component sensor histidine kinase | 1.17 | 2.82 | 0.97
LBA1525 | Two-component system regulator | 2.09 | 1.54 | 1.03

Defense mechanisms [V]
LBA0074 | ABC transporter, ATP binding and permease protein | 2.27 | 0.96 | 1.07
LBA0075 | ABC transporter, ATP binding and permease protein | 3.01 | 3.90 | 2.94
LBA1838 | ABC transporter, ATP-binding protein | 1.55 | 4.15 | 7.37
LBA1839 | Putative permease | 1.48 | 5.16 | 8.72
LBA1876 | ABC transporter, ATP-binding/membrane spanning protein | 1.79 | 1.98 | 1.59

Posttranslational modification, protein turnover, chaperones [O]
LBA0165 | Neutral endopeptidase pepO | 2.91 | 3.28 | 1.97
LBA1512 | Proteinase P precursor prtP | 7.53 | 7.02 | 1.58
LBA1564 | Putative membrane protein | 1.42 | 1.47 | 2.08

Cell wall/membrane/envelope biogenesis [M]
LBA0018 | Unknown | 0.90 | 1.00 | 0.55
LBA1469 | UDP-glucose 4-epimerase | 0.18 | 0.51 | 0.67

Transcription [K]
LBA1840 | Transcriptional regulator (TetR/AcrR family) | 1.33 | 3.52 | 12.60

General function prediction only [R]
LBA0367 | Putative receptor | 1.04 | 1.74 | 1.26

Energy production and conversion [C]
LBA0463 | Acetate kinase | 2.31 | 1.12 | 1.27

Translation, ribosomal structure and biogenesis [J]
LBA0672 | Putative phosphate starvation induced protein yvyD | 1.12 | 0.93 | 0.43

Intracellular trafficking, secretion, and vesicular transport [U]
LBA1496 | Putative fibrinogen-binding protein | 3.56 | 2.35 | 1.22

Replication, recombination and repair [L]
LBA1565 | Unknown | 2.02 | 1.43 | 1.37

Function unknown/General function prediction only [S], [R]
LBA0555 | Myosin-crossreactive antigen | 1.05 | 0.98 | 0.43
LBA0872 | Putative membrane protein | 2.14 | 5.27 | 2.97
LBA1119 | Putative inner membrane protein | 4.22 | 3.46 | 6.24
LBA1869 | Beta-phosphoglucomutase | 0.69 | 0.70 | 0.24
LBA1952 | Hypothetical protein | 1.07 | 0.89 | 2.26

No COG found
LBA0352 | Hypothetical protein | 0.94 | 0.87 | 0.47
LBA0402 | Unknown | 1.01 | 0.81 | 1.38
LBA0404 | Hypothetical protein | 0.88 | 0.60 | 0.95
LBA1495 | >>Putative fibrinogen-binding protein<< | 1.62 | 0.97 | 1.15
LBA1611 | Surface protein fmtB | 0.56 | 0.93 | 0.94
LBA1690 | Putative membrane protein | 1.62 | 1.32 | 2.34

¹Array ratios from two biological replicates and two technical replicates for each condition were averaged.
²Clusters of Orthologous Groups (37). Genes were classified according to the COG domain present in the potentially encoded protein sequence.
³Values in boldface indicate ratios that meet the P criteria (P < 0.05).

The most dramatic changes in expression in the HPK mutant were observed in genes predicted to encode components of the proteolytic enzyme system. Proteolytic systems of lactic acid bacteria are divided into three functional categories: 1) proteinases (that degrade casein into small peptides); 2) transport systems (that import those peptides); and 3) peptidases (Kunji et al. (1996) Antonie van Leeuwenhoek 70:187-221). The expression of ORF LBA1512 encoding the proteinase precursor in L. acidophilus, PrtP (39% identical and 53% similar to the cell envelope proteinase PrtR from L. rhamnosus GI27527536), increased in the HPK mutant more than 7-fold at pH 6.8 and 5.5 (Table 3). However, PrtM (LBA1588), the protein putatively involved in the maturation of the proteinase, showed expression levels comparable to the control strain (ratios between 0.8 and 1.1).

Two operons potentially encoding oligopeptide ABC transporters are present in the L. acidophilus genome (FIG. 3), opp1 (ORFs LBA0197 to LBA0203) and opp2 (ORFs LBA1300 to LBA1306). Each consists of six genes: opp1 consists of oppD1, oppF1, oppB1, oppC1, oppA1, and oppA1B, and opp2 consists of oppD2, oppF2, oppB2, oppC2, oppA2, and oppA2B, coding for two ATP-binding proteins (OppD, OppF), two membrane proteins (OppB, OppC) and two substrate-binding proteins (OppA and OppA-B). The oppA and oppA-B genes in both operons are separated by terminators from the downstream genes. Expression of the opp1 operon was significantly increased in the HPK mutant at pH 6.8 and 5.5, showing increments of 6- to 8-fold in most of the genes in the operon. The ORFs encoded by opp2 showed an increased expression in the mutant at pH 5.5. Interestingly, oppA2 (LBA1300) was down regulated in the mutant under all the evaluated conditions, but the expression of oppA2B (LBA1301) increased significantly at pH 6.8 in the mutant (Table 3). Two other ORFs encoding putative oligopeptide binding proteins were differentially expressed in the mutant. LBA1665 was consistently under expressed in the HPK mutant at the three pHs. In contrast, LBA1961 was over expressed at pH 6.8 and 5.5.

Four peptidases were also differentially expressed in the HPK mutant strain. A neutral endopeptidase PepO (LBA0165) was up regulated at all pHs evaluated. The aminopeptidase encoded by LBA0911, and peptidase T (LBA1515) were up regulated at pH 5.5. Finally, a cytosol non-specific dipeptidase encoded by ORF LBA1837 was significantly up regulated at pH 4.5.

To investigate potential alterations in the proteolytic system of the HPK mutant, we compared the acidification rates of L. acidophilus NCFM (wt, used because NCK1398 does not grow in milk) versus the HPK mutant in 10% skim milk (SM) and in 10% SM plus yeast extract (FIG. 4A). The HPK mutant was not able to acidify SM below pH 5.0, compared to the control where the pH dropped to nearly pH 4.0. Supplementation of SM with 0.5% yeast extract completely restored a wild-type level of acidification activity in the HPK mutant. In addition, supplementation of SM with 0.25% casamino acids also nearly abolished the difference between the wt and the HPK mutant (FIG. 4B). These data suggest that the mutant was deficient in proteolytic activity. In addition, other component(s) present in yeast extract further stimulated the acidification rates of both the parent and mutant to equal levels in skim milk.

Expression of LBA1080 (a putative methionine synthase) and LBA1081 (luxS) was increased up to 6.9-fold under all conditions in the HPK mutant. At the amino acid level, the LuxS (LBA1081) homolog in the genome sequence of L. acidophilus was 77% identical and 84% similar to the S-ribosylhomocysteinase (autoinducer-2 production protein LuxS) from L. plantarum (Kleerebezem et al. (2003) Proc. Natl. Acad. Sci. U.S.A. 100:1990-1995), and 73% identical and 89% similar to LuxS from S. pyogenes (Lyon et al. (2001) Mol. Microbiol. 42:145-157). Examination of the surrounding chromosomal region suggested that luxS is the second member of an operon consisting of five genes whose function is poorly characterized. A putative rho-independent terminator with a low free energy of −8.5 kcal/mol was present downstream of luxS.

Among the global transcriptional changes observed in the HPK mutant, two key enzymes involved in lysine biosynthesis, aspartate kinase (EC 2.7.2.4, LBA0850) and diaminopimelate epimerase (EC 5.1.1.7, LBA0849), were up-regulated at pH 5.5. Additionally, a putative operon composed of a cytosol non-specific dipeptidase, an ABC transporter and a transcriptional regulator from the TetR/AcrR family (ORFs LBA1837 to LBA1840) was highly up regulated at pH 4.5.

Given the similarity of LBA1524HPK with lisK, the HPK from L. monocytogenes (Barefoot and Klaenhammer (1983) Appl. Environ. Microbiol. 45:1808-1815), and the fact that a lisK-deficient mutant was able to grow at a higher concentration of ethanol than its parent strain, survival of the L. acidophilus HPK mutant was investigated in the presence of ethanol. No differences were observed when log-phase cells were exposed to 15% (v/v) ethanol, indicating that L. acidophilus is naturally highly resistant. However, at 20% ethanol the HPK mutant showed a 4-log reduction in survival after 90 min compared to only a 1-log reduction in the control (data not shown).

Confirmation of DNA Microarray Results by Northern Blotting

Cells of the control and the HPK mutant strains were harvested at an A₆₀₀ of 0.3 and exposed to pH 6.8, 5.5, and 4.5 in MRS broth for 30 minutes. Total RNA was prepared and hybridized with several labeled probes. For analysis of gene expression, DNAs of the ORFs indicated in Table 2 were amplified by PCR and labeled with α-³²P. Selected for analysis by Northern blot were oppA1 (LBA0197, up regulated in the HPK mutant), oppA2 (LBA1300, down regulated in the HPK mutant cells), and the LBA1524HPK and LBA1525RR genes (components of the inactivated 2CRS). Genes encoding a glyceraldehyde-3-P dehydrogenase (LBA0698), malolactic enzyme (LBA1075), and RNA polymerase sigma factor rpoD (LBA1196) were also evaluated as controls because these were not differentially expressed at the different pH conditions when evaluated in the microarrays (data not shown).

The hybridized membranes and a comparison between the relative expression ratios obtained by microarray and Northern analysis are shown in FIGS. 5A and B. The transcription levels of the selected genes, as measured by the DNA microarray method, were consistent with those measured by Northern hybridizations, with one exception. The disrupted gene LBA1524HPK showed 10-fold more RNA when measured by Northern blot, but only 2-fold more according to the microarray. This suggests that Northern analysis was better able to quantitate gene expression at higher levels.
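The comparison of the two methods can be sketched in Python as follows. This is only an illustration, assuming each method yields a mutant-to-control expression ratio for a gene; the function names are hypothetical, and the only values taken from the text are the 10-fold (Northern) and 2-fold (microarray) ratios reported for LBA1524HPK.

```python
import math


def log2_ratio(mutant_signal: float, control_signal: float) -> float:
    """Log2 of the mutant/control signal ratio (positive means up-regulated)."""
    return math.log2(mutant_signal / control_signal)


def same_direction(ratio_a: float, ratio_b: float) -> bool:
    """True when both ratios report up-regulation, or neither does."""
    return (ratio_a > 1) == (ratio_b > 1)


def fold_discrepancy(ratio_a: float, ratio_b: float) -> float:
    """How many times larger the bigger ratio is than the smaller one."""
    return max(ratio_a, ratio_b) / min(ratio_a, ratio_b)


# Ratios reported in the text for LBA1524HPK.
northern_ratio, microarray_ratio = 10.0, 2.0
print(same_direction(northern_ratio, microarray_ratio))
print(fold_discrepancy(northern_ratio, microarray_ratio))
```

Under these assumptions the two methods agree on the direction of the change for LBA1524HPK and differ only in its magnitude.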

Discussion

Analysis of the genome sequence of L. acidophilus revealed the presence of nine 2CRS (Altermann et al. (2004) Proc. Natl. Acad. Sci. U.S.A. 102:3906-3912). All the identified histidine protein kinases showed between two and six transmembrane domains, suggesting their location in the cell membrane. One of the identified 2CRSs, LBA1524HPK-LBA1525RR, showed homology to lisRK, a signal transduction system previously shown to participate in stress response and virulence in L. monocytogenes (Cotter et al. (1999) J. Bacteriol. 181:6840-6843). When we insertionally interrupted LBA1524HPK, log-phase cells became more sensitive to acid pH. We previously reported that L. acidophilus induces an adaptive response at pH 5.5 that provides elevated acid tolerance to the cells (Azcarate-Peril et al. (2004) Appl. Environ. Microbiol. 70:5315-5322). Both the HPK mutant and the control NCFM::lacL exhibited an acid-induced tolerance response (ATR), although this response was slightly impaired by the LBA1524HPK mutation. This indicates that while LBA1524HPK-LBA1525RR plays some role, additional mechanisms that are not regulated by this 2CRS contribute to acid adaptation in L. acidophilus.

A whole genome array comparing the expression profile between the control and the HPK mutant revealed an altered expression pattern of numerous ORFs encoding genes for major components of the proteolytic enzyme system. Based on its genome sequence, L. acidophilus has a limited capacity to synthesize amino acids, with the potential to synthesize only three amino acids (cysteine, serine, and aspartate) de novo. Additionally, cysteine and serine could be synthesized from pyruvate, and aspartate from fumarate. Based on these three amino acids, a series of other derivatives might be generated (asparagine, threonine, glycine, lysine, methionine, glutamine and glutamate). However, neither de novo nor conversion pathways were predicted for the remaining 13 amino acids (Altermann et al. (2004) Proc. Natl. Acad. Sci. U.S.A. 102:3906-3912). Therefore, amino acid requirements must be satisfied by the uptake of amino acids and oligopeptides. L. acidophilus encodes two putative oligopeptide transporting systems (Altermann et al. (2004) Proc. Natl. Acad. Sci. U.S.A. 102:3906-3912), opp1 (ORFs LBA0197 to LBA0203) and opp2 (ORFs LBA1300 to LBA1306). In addition, six genes coding for periplasmic substrate-binding proteins (OppA) were identified (LBA1216, LBA1347, LBA1400, LBA1665, LBA1958, and LBA1961). One major function of oligopeptide transport (Opp) systems for bacterial cells is to internalize peptides to be used as carbon and nitrogen sources. They are also involved in the recycling of the cell wall peptides, which are likely one of the first targets of physicochemical stress. Opp systems are members of the ABC transporter family and usually consist of two ATP-binding proteins, two transmembrane proteins, and an extracellular ligand-specific binding protein. In gram-positive bacteria, the substrate-binding protein aligns with the external face of the cytoplasmic membrane (Sutcliffe and Russell (1995) J. Bacteriol. 177:1123-1128) and biochemical evidence suggests that they have a chaperone-like function in protein folding, protection against thermal denaturation, and interaction with unfolded proteins (Richarme and Caldas (1997) J. Biological Chem. 272:15607-15612).

Since several components of the proteolytic system were overexpressed, we expected that the HPK mutant would be able to grow better in milk than the control. On the contrary, the mutant was not able to acidify 10% skim milk (SM) below pH 5.0. However, when SM was supplemented with yeast extract both the parent and the mutant were stimulated to the same degree. Yeast extract is the water-soluble portion of autolyzed yeast, containing vitamin B complex. It provides vitamins, nitrogen, amino acids, and carbon in growth media or supplemented milk. Furthermore, supplementation of SM with casamino acids essentially abolished differences in acidification rate between the wild type and the mutant strains. These observations provide evidence that the proteolytic system in the HPK mutant was debilitated. An alternative possibility is that inactivation of the 2CRS resulted in the reduced expression of a specific amino acid transporter. The decreased intracellular concentration of that amino acid might trigger the cell to overexpress other options to obtain that amino acid, i.e., through peptide transport and peptidases, or through other pathways such as enzymes involved in the biosynthesis of lysine (LBA0849 and LBA0850). Two genes encoding putative opp binding proteins (LBA1300 and LBA1665) were consistently underexpressed in the mutant, suggesting that these transport systems are important for the organism's ability to grow in milk. It is not clear, however, why other opp transporters present in the genome would not replace any loss of capacity from the limited expression of LBA1300 and LBA1665, especially when a number of these were overexpressed.

Opp systems are also related to mechanisms of signaling since they transport signal peptides that, once inside the cell, will interact with intracellular receptors to regulate cellular functions (Lazazzera (2001) Peptides 22:1519-1527). In gram-positive bacteria, cell-density response mechanisms are well studied. A peptide signal precursor locus is translated into a precursor protein that is cleaved to produce an autoinducer signal that is transported out of the cell. When the extracellular concentration of the peptide signal accumulates to the minimal stimulatory level, an HPK of a 2CRS detects it and the phosphorylated RR activates the transcription of target genes (Miller and Bassler (2001) Annu. Rev. Microbiol. 55:165-199).

Interestingly, the autoinducer-2 production gene, luxS, was significantly overexpressed in the HPK mutant. The gene luxS is responsible for the production of an autoinducer molecule AI-2 in Vibrio harveyi and other gram-positive and gram-negative bacteria (Schauder et al. (2001) Mol. Microbiol. 41:463-476). LuxS is the autoinducer synthase, responsible for catalysis of the final step in AI-2 biosynthesis. The disruption of luxS in S. pyogenes had several effects suggesting that it is an important component of the response machinery that allows this strain to adapt to changing conditions during an infection. These effects include regulation of the SpeB protease and stress response (Lyon et al. (2001) Mol. Microbiol. 42:145-157). The gene located upstream of luxS (LBA1080) was also up-regulated in the mutant at both pH 5.5 and 4.5.

Intriguingly, the expression of the aspartate kinase (EC 2.7.2.4, LBA0850) and diaminopimelate epimerase (EC 5.1.1.7, LBA0849) was increased at pH 5.5 in the HPK mutant. These are key enzymes in the biosynthesis of lysine and are organized in an operon in L. acidophilus. However, the diaminopimelate decarboxylase (EC 4.1.1.20, LBA0851), the enzyme responsible for the last step in the synthesis of lysine, was not overexpressed in the HPK mutant under these conditions. We therefore suggest that D,L-diaminopimelate, instead of being converted to L-lysine, enters the peptidoglycan biosynthesis pathway. It is unclear if the HPK mutant produces more peptidoglycan. If so, that may contribute to the changes observed in cell morphology and chain length.

In summary, environmental conditions that included changes in acid concentration and fluctuations of pH were sensed by the HPK of this 2CRS, LBA1524HPK. It would be expected that this protein then initiates a phosphorylation cascade that regulates expression of a number of genes in the L. acidophilus genome. Most of the differentially expressed genes were up-regulated in the HPK mutant, suggesting that LBA1525RR may act as a repressor. The inactivation of this 2CRS resulted in alterations in cell morphology, acid sensitivity, ethanol sensitivity, and poor acidification rates in skim milk, indicating a loss of proteolytic activity. Microarray data showed that more than 50% of the genes differentially expressed in the HPK mutant encode putative membrane proteins. Additionally, expression of multiple components of the proteolytic enzyme system, i.e., opp transporters, permeases, and peptidases, was dramatically affected by the inactivation of the HPK, but no simple correlation of higher or lower gene expression to proteolytic activity, or the loss thereof, was apparent.

Example 5 Genetic Characterization of an Operon Encoding the Bacteriocin, Lactacin B, in Lactobacillus acidophilus NCFM

Bacteriocins are a diverse group of antimicrobial peptides produced by microorganisms. Their range of inhibition is narrow, typically limited to species that inhabit the same environmental niches such as the gastrointestinal tract. Many bacteriocins are able to elicit their lethal effects by creating pores in the cellular membrane of target organisms. This results in dissipation of the proton motive force and leakage of ATP and essential cellular ions, leading to cell death. Currently, bacteriocins produced by lactic acid bacteria (LAB), in particular, are widely used within the food industry due to their efficacy against foodborne pathogens such as Listeria monocytogenes and Clostridium botulinum. Lactacin B is a chromosomally encoded bacteriocin produced by Lactobacillus acidophilus NCFM. Recent sequencing of the NCFM genome revealed a primary region of interest possibly responsible for lactacin B production, processing, and export. The overall objective of our study was to investigate the role of this region in lactacin B production and processing.

The activity of lactacin B, a bacteriocin produced by L. acidophilus NCFM, was assayed using the direct method for bacteriocin detection (Barefoot et al. (1983) Appl. Environ. Microbiol. 45:1808-1815). Zones of inhibition indicate death of the indicator strain. Bacteriocin production by L. acidophilus NCFM and its derivatives was carried out under both aerobic and anaerobic conditions.

Assays of stationary-phase cultures of NCFM were carried out as follows: 5 μl of culture was aliquoted onto an MRS agar plate (1.5% w/v agar). MRS soft agar (0.75% w/v) containing the indicator strain was poured onto the surface of the plate. After 19-24 hours of incubation, zones of inhibition were analyzed.

The consensus genetic elements necessary for production of many LAB bacteriocins have been elucidated. These elements can exist on a plasmid and/or chromosomally and include genes encoding a two-component regulatory system, one or more structural genes encoding the pre-bacteriocin peptide, a gene encoding an immunity protein and finally one or more genes encoding a dedicated export system responsible for export of the bacteriocin molecule from the cell. These coordinated processes yield a mature biologically active antimicrobial peptide as illustrated by Ennahar et al. (2000) FEMS Microbiol. Lett. 24:85-106.

Previous analysis revealed that lactacin B is a 6.5 kDa bacteriocin with antagonistic activity against closely related species; the genetic determinants were unknown (Barefoot et al. (1983) Appl. Environ. Microbiol. 45:1808-1815). Recent mining of the NCFM genome revealed a region possibly responsible for lactacin B production (Altermann et al. (2004) Proc. Natl. Acad. Sci. USA 102:3906-12). This region is flanked by two strong terminators and includes 11 putative open reading frames (ORFs) with similarities to conventional bacteriocin machinery including a regulation system, an immunity protein, and a dedicated ABC transporter protein involved in bacteriocin export. Seven additional putative open reading frames with unknown functions were also identified in the putative operon (data not shown). Table 4 provides a summary of the various open reading frames and their function.

TABLE 4

ORF      Size (aa)  Homology (accession no.)                                             Proposed function
LBA1803  53
LBA1802  63
LBA1801  38
LBA1800  47
LBA1799  440        Two-component system protein histidine kinase (NP_964617)            Regulation of lactacin B
LBA1798  270        Two-component system protein histidine kinase (NP_964619)            Regulation of lactacin B
LBA1797  83
LBA1796  720        ABC transporter permease component (NP_964620)                       Lactacin B export
LBA1794  196        Transporter auxiliary protein (NP_964629)                            Lactacin B export
LBA1793  438        Immunity/modification protein [Streptococcus thermophilus LMG18311]  Immunity to Lactacin B
LBA1792  63
LBA1791  67

In order to examine the role that this region plays in lactacin B production, the gene encoding the putative ABC transporter protein (LBA1796) was functionally disrupted by homologous recombination using the targeted integration vector pORI28 as described by Russell et al. (2001) Appl. Environ. Microbiol. 67:4361-4364. An 800 bp internal fragment of LBA1796 was PCR amplified and cloned into pORI28 using XbaI/BglII sites. Subsequent transformation into NCFM containing a temperature-sensitive helper plasmid (pTRK669) allowed selection of chromosomal integrants following a temperature increase.

Inactivation of the putative ABC transporter protein was confirmed via Southern hybridization analysis (data not shown) and by PCR to confirm junction fragments using chromosomal DNA as a template (data not shown). The integrant was assayed for lactacin B activity (FIGS. 6A and B). A bacteriocin assay was performed comparing wild-type NCFM (FIG. 6A) with the NCFM integrant (FIG. 6B). Lactacin B activity was abolished in the integrant.

The ABC transporter protein (LabT) appears crucial for lactacin B export and activity. It is likely that this region also encodes the genetic determinants for lactacin B regulation, production, and immunity.

Example 6 Characterization of a Two-Component Regulatory System Implicated in the Bile Tolerance of Lactobacillus acidophilus NCFM

The effectiveness of any bacterium used as a probiotic or biotherapeutic vector intended to act in the intestinal tract depends on its ability to survive in this region, where it must be able to withstand stresses imposed by the body's physicochemical defense system. These stresses include low pH, high osmolality, and the presence of bile (Chowdhury et al. (1996) Stress Response in Pathogenic Bacteria. Indian Journal of Biosciences 21:149-160). Bile's amphipathic nature allows it to act as a detergent, dissolving the phospholipid membranes that surround bacteria, leading to a loss of membrane integrity and cell death. In addition to its action as a detergent, bile has been shown to cause DNA damage and induce genes involved with DNA repair (Dashkevicz et al. (1989) Appl Environ Microbiol 55:11-6, McAuliffe et al. In Press. Appl Environ Microbiol.). Bacteria employ a plethora of mechanisms to respond to and defend against bile in their environment, including mechanisms that remove bile from the cell, modify it, and repair damage through general stress responses. The pathway to the induction of genes that mediate these responses is largely unknown, but may be mediated by a histidine protein kinase-response regulator phosphorelay pathway (Begley et al. In Press. The interaction between bacteria and bile. FEMS Microbiol Rev.). A whole-genome microarray study has shown that several genes in L. acidophilus NCFM are upregulated upon exposure to 5% Oxgall (see Example 2). Included among these was a group of six tandem genes containing both histidine kinase and response regulator genes. This study examined this putative operon and the influence of its histidine kinase on cell growth in the presence of bile.

A microarray study of the expression of genes in L. acidophilus NCFM cells exposed to 5% Oxgall indicated the upregulation of six tandem genes (LBA1432-LBA1427) with this treatment (see Example 2). The sequenced L. acidophilus NCFM genome (Accession NC_006814) indicates that LBA1430 encodes a histidine protein kinase and LBA1431 encodes a response regulator. The function of the other genes in this group remains largely unknown, although LBA1429 contains 12 transmembrane domains and is believed to encode a transporter (Altermann et al. (2005) Proc Natl Acad Sci USA 102:3906-12). Clone Manager software indicated the presence of dyad symmetry that could lead to a stem-loop structure, typical of a transcriptional terminator, upstream of the putative operon. Reverse transcriptase PCR using primers designed to amplify the intergenic regions in the proposed operon was performed on RNA extracted from cells grown in MRS with no Oxgall or MRS+0.3% Oxgall for 1.5 hours. PCR amplification of cDNA from the intergenic regions indicated that these genes are cotranscribed.
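To illustrate what a search for dyad symmetry involves, the Python sketch below scans a DNA sequence for a short arm that is followed, after a loop of a few bases, by its reverse complement. This is a simplified stand-in and not the algorithm used by Clone Manager; the function names, the default stem and loop lengths, and the example sequence are all hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_dyad_symmetry(seq: str, stem: int = 6, min_loop: int = 3, max_loop: int = 8):
    """Return (start, stem, loop) tuples for perfect inverted repeats.

    A hit means seq[start:start + stem] is followed, after `loop` bases,
    by its exact reverse complement, so the two arms could base-pair.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[start:start + stem]
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem + loop
            right = seq[right_start:right_start + stem]
            if len(right) == stem and right == reverse_complement(left):
                hits.append((start, stem, loop))
    return hits


# Hypothetical sequence: arm GCCGCA, loop TTTT, arm TGCGGC, with short flanks.
example = "AAGCCGCATTTTTGCGGCTT"
print(find_dyad_symmetry(example))
```

The sketch only reports perfect repeats of a fixed stem length. A practical terminator search would also tolerate mismatches, score stem stability, and look for a downstream run of T residues.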

The putative histidine kinase gene in the six-gene operon was selected for inactivation in order to investigate its role in bile tolerance in L. acidophilus NCFM. This inactivation was carried out by insertion of an erythromycin cassette by the method of Russell and Klaenhammer, utilizing the Ori+ and RepA− integration plasmid, pORI28 (Flahaut et al. (1996) Appl. Environ. Microbiol. 62:2416-2420). The integration vector was created through ligation of a BglII and XbaI digested pORI28 with a BglII and XbaI digested PCR fragment of LBA1430. The resulting plasmid, pTRK843, was transformed into L. acidophilus NCFM containing pTRK669, a plasmid containing a functional repA gene and a chloramphenicol resistance cassette. A temperature shift from 37° to 42° resulted in the loss of pTRK669 and selection for clones where pTRK843 had integrated into the genome. Integration of pTRK843 was confirmed by Southern blot (data not shown).

Cells were grown anaerobically in MRS with 0.0%, 0.3%, or 0.5% Oxgall for 15 hours with OD600 measurements taken every 15 minutes. The resulting growth curves showed that, compared to the wild type strain, the histidine kinase mutant grew progressively less well as the concentration of Oxgall in the medium increased. An ABC transporter (LBA1796) knockout mutant (Bac) was used in this study as a control so that erythromycin selection pressure, which indicates the continued presence of the insertional knockout, could be maintained (A. Dobson, personal communication). The maximum specific growth rate (μmax, h⁻¹) for the HPK mutant was significantly different from that of the controls when grown in MRS with 0.3% and 0.5% Oxgall. See FIG. 7.
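One common way to estimate a maximum specific growth rate from such readings is to take the steepest slope of ln(OD600) against time over a sliding window. The Python sketch below illustrates this approach; it is an assumption about how μmax could be computed rather than the procedure used in this study, and the function name, window size, and test data are hypothetical. The default interval of 0.25 h matches the 15-minute sampling described above.

```python
import math


def max_specific_growth_rate(od_values, interval_h=0.25, window=5):
    """Estimate mu_max (h^-1) as the steepest least-squares slope of
    ln(OD600) versus time over any run of `window` consecutive readings."""
    if window < 2 or len(od_values) < window:
        raise ValueError("need at least `window` readings and window >= 2")
    logs = [math.log(v) for v in od_values]
    times = [i * interval_h for i in range(len(od_values))]
    best = float("-inf")
    for i in range(len(logs) - window + 1):
        t = times[i:i + window]
        y = logs[i:i + window]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        # Ordinary least-squares slope for this window.
        slope = (sum((a - t_mean) * (b - y_mean) for a, b in zip(t, y))
                 / sum((a - t_mean) ** 2 for a in t))
        best = max(best, slope)
    return best


# Hypothetical culture that doubles every hour, read once per hour.
readings = [0.05 * 2 ** i for i in range(8)]
print(max_specific_growth_rate(readings, interval_h=1.0, window=4))
```

For a culture that doubles once per hour, the estimate should approach ln 2, or about 0.693 h⁻¹. Real curves include lag and stationary phases, which is why the maximum over all windows is taken instead of a single overall slope.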

Growth curve experiments were also performed using 0.3% of individual bile salts: taurocholic acid, taurodeoxycholic acid, taurochenodeoxycholic acid, glycocholic acid, and glycodeoxycholic acid. Sodium taurodeoxycholate was the only salt that affected the growth of the HPK mutant as compared to wild type; however, the maximum specific growth rates of the strains were not significantly different with any of the salts. See FIG. 8.

It is known that some lactobacilli and bifidobacteria strains, including L. acidophilus NCFM, possess the ability to deconjugate bile salts, that is, to separate their amino acid moiety from the cholesterol backbone (Gilliland et al. (1977) Appl Environ Microbiol 33:15-8). Although the role of this process in bacteria is not clear, it is believed to confer some positive effect on the cell, including protection against the toxic effects of bile (Flahaut et al. (1996) Appl. Environ. Microbiol. 62:2416-2420). Functional analysis of the bile salt hydrolase genes has shown that the bshB gene (LBA1078) deconjugates sodium taurodeoxycholate (McAuliffe et al. In Press. Appl Environ Microbiol.). Since the growth of the HPK mutant was decreased in this particular salt, it is possible that genes controlled by this particular HPK include the bshB gene. In order to determine if the HPK mutant retained the ability to deconjugate this bile salt, cells were plated onto MRS agar with 0.3% of each of the salts used in the growth experiments. Zones of clearing surrounding the colonies indicated the activity of the bile salt hydrolases (Dashkevicz et al. (1989) Appl Environ Microbiol 55:11-6). No difference in deconjugation was seen between the wild type L. acidophilus NCFM and the HPK mutant strain.

It has been proposed that, because sodium taurodeoxycholate is more hydrophobic than other bile salts, it imposes a more disruptive effect on bacterial cell membranes (Sung et al. (1993) Dig Dis Sci 38:2104-12). Since this particular salt lowers growth of the HPK mutant as compared to the wild type and control strains, it is possible that genes regulated by this particular histidine kinase encode proteins that may counteract this disruptive effect.

LBA1427-1432 in L. acidophilus NCFM constitute an operon involved in bile tolerance. LBA1430 in L. acidophilus NCFM encodes a histidine kinase involved in bile tolerance. Loss of histidine kinase activity from LBA1430 leads to a decreased ability of cells to grow in the presence of bile. Sodium taurodeoxycholate has a more inhibitory effect on the growth of the HPK mutant than other salts tested.

All publications, patents and patent applications mentioned in the specification are indicative of the level of those skilled in the art to which this invention pertains. All publications, patents and patent applications are herein incorporated by reference in their entireties to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated by reference for the teachings disclosed in the sentence and/or paragraph in which the publication, patent or patent application is cited.

Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be obvious that certain changes and modifications may be practiced within the scope of the appended claims.

1. An isolated nucleic acid molecule comprising: a) a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 49, 51, 53, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 113, 115, 117, 119, 121, 123, 125, 127, 129, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161 or 163; b) a nucleic acid molecule comprising a nucleotide sequence having at least 80% sequence identity to the nucleotide sequence of SEQ ID NO:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 49, 51, 53, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 113, 115, 117, 119, 121, 123, 125, 127, 129, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161 or 163; c) a nucleic acid molecule that encodes a polypeptide comprising the amino acid sequence of SEQ ID NO:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 50, 52, 54, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 114, 116, 118, 120, 122, 124, 126, 128, 130, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162 or 164; d) a nucleic acid molecule comprising a nucleotide sequence encoding a polypeptide having at least 80% amino acid sequence identity to the amino acid sequence of SEQ ID NO:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 50, 52, 54, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 114, 116, 118, 120, 122, 124, 126, 128, 130, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162 or 164; or e) a complement of any of (a)-(d).
2. A vector comprising the nucleic acid molecule of claim 1.
3. The vector of claim 2, further comprising a nucleic acid molecule encoding a heterologous polypeptide.
4. A cell comprising the vector of claim 2.
5. The cell of claim 4 that is a bacterial cell.
6. An isolated polypeptide comprising: a) a polypeptide comprising the amino acid sequence of SEQ ID NO:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 50, 52, 54, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 114, 116, 118, 120, 122, 124, 126, 128, 130, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162 or 164; b) a polypeptide comprising an amino acid sequence having at least 80% sequence identity to the amino acid sequence of SEQ ID NO:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 50, 52, 54, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 114, 116, 118, 120, 122, 124, 126, 128, 130, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162 or 164, wherein said polypeptide retains activity; c) a polypeptide encoded by the nucleotide sequence of SEQ ID NO:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 49, 51, 53, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 113, 115, 117, 119, 121, 123, 125, 127, 129, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161 or 163; or d) a polypeptide that is encoded by a nucleic acid molecule comprising a nucleotide sequence having at least 80% sequence identity to the nucleotide sequence of SEQ ID NO:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 49, 51, 53, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 113, 115, 117, 119, 121, 123, 125, 127, 129, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161 or 163, wherein said polypeptide retains activity.
7. The polypeptide of claim 6, further comprising heterologous amino acid sequences.
8. An antibody that selectively binds to a polypeptide of claim 6.
9. A method for producing a polypeptide, comprising culturing the cell of claim 4 under conditions in which a nucleic acid molecule encoding the polypeptide is expressed, said polypeptide comprising: a) a polypeptide comprising the amino acid sequence of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 50, 52, 54, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 114, 116, 118, 120, 122, 124, 126, 128, 130, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162 or 164; b) a polypeptide encoded by the nucleic acid sequence of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 49, 51, 53, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 113, 115, 117, 119, 121, 123, 125, 127, 129, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161 or 163; c) a polypeptide comprising an amino acid sequence having at least 80% sequence identity to the amino acid sequence of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 50, 52, 54, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 114, 116, 118, 120, 122, 124, 126, 128, 130, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162 or 164, wherein said polypeptide retains activity; or d) a polypeptide encoded by a nucleotide sequence having at least 80% sequence identity to the nucleotide sequence of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 49, 51, 53, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 113, 115, 117, 119, 121, 123, 125, 127, 129, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161 or 163, wherein said polypeptide retains activity.
10. A method for detecting the presence of a polypeptide in a sample, comprising contacting the sample with a compound that selectively binds to the polypeptide and detecting the binding of the compound to the polypeptide, wherein said polypeptide comprises: a) a polypeptide encoded by the nucleotide sequence of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 49, 51, 53, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 113, 115, 117, 119, 121, 123, 125, 127, 129, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161 or 163; b) a polypeptide comprising the amino acid sequence of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 50, 52, 54, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 114, 116, 118, 120, 122, 124, 126, 128, 130, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162 or 164; c) a polypeptide encoded by a nucleic acid sequence having at least 80% sequence identity to the nucleotide sequence of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 49, 51, 53, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 113, 115, 117, 119, 121, 123, 125, 127, 129, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161 or 163, wherein said polypeptide retains activity; or d) a polypeptide comprising an amino acid sequence having at least 80% sequence identity to the amino acid sequence of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 50, 52, 54, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 114, 116, 118, 120, 122, 124, 126, 128, 130, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162 or 164, wherein said polypeptide retains activity.
11. The method of claim 10, wherein the compound that binds to the polypeptide is an antibody.
12. A kit comprising a compound for use in the method of claim 10 and instructions for use.
13. A method for detecting the presence of the nucleic acid molecule of claim 1 in a sample, comprising: a) contacting the sample with a nucleic acid probe or primer that selectively hybridizes to the nucleic acid molecule; and, b) detecting hybridization of the nucleic acid probe or primer with the nucleic acid molecule.
14. The method of claim 13, wherein the sample comprises mRNA molecules and is contacted with a nucleic acid probe.
15. A kit comprising a nucleic acid molecule that selectively hybridizes to the nucleic acid molecule of claim 1 and instructions for use.
16. A method for increasing the ability of a microorganism to survive stressful conditions, comprising introducing into said microorganism a nucleic acid molecule comprising at least one nucleotide sequence comprising: a) the nucleotide sequence of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 49, 51, 53, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 113, 115, 117, 119, 121, 123, 125, 127, 129, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161 or 163; b) a nucleotide sequence encoding a polypeptide comprising the amino acid sequence of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 50, 52, 54, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 114, 116, 118, 120, 122, 124, 126, 128, 130, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162 or 164; c) a nucleotide sequence that is at least 80% identical to the sequence of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 49, 51, 53, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 113, 115, 117, 119, 121, 123, 125, 127, 129, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161 or 163, wherein said nucleotide sequence encodes a polypeptide that retains activity; or, d) a nucleotide sequence encoding a polypeptide comprising an amino acid sequence having at least 80% sequence identity to the amino acid sequence of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 50, 52, 54, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 114, 116, 118, 120, 122, 124, 126, 128, 130, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162 or 164, wherein said polypeptide retains activity.
17. The method of claim 16, wherein said stressful conditions comprise osmotic stress.
18. A method for enhancing the ability of a microorganism to survive passage through the gastrointestinal tract, comprising introducing into said microorganism a nucleic acid molecule comprising at least one nucleotide sequence comprising: a) the nucleotide sequence of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 49, 51, 53, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 113, 115, 117, 119, 121, 123, 125, 127, 129, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161 or 163; b) a nucleotide sequence encoding a polypeptide comprising the amino acid sequence of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 50, 52, 54, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 114, 116, 118, 120, 122, 124, 126, 128, 130, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162 or 164; c) a nucleotide sequence that is at least 80% identical to the sequence of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 49, 51, 53, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 113, 115, 117, 119, 121, 123, 125, 127, 129, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161 or 163, wherein said nucleotide sequence encodes a polypeptide that retains activity; or, d) a nucleotide sequence encoding a polypeptide comprising an amino acid sequence having at least 80% sequence identity to the amino acid sequence of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 50, 52, 54, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 114, 116, 118, 120, 122, 124, 126, 128, 130, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162 or 164, wherein said polypeptide retains activity.
 19. A method for increasing the ability of a microorganism to survive in the presence of an antimicrobial, comprising introducing into said microorganism a nucleic acid molecule comprising at least one nucleotide sequence comprising: a) the nucleotide sequence of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 49, 51, 53, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 113, 115, 117, 119, 121, 123, 125, 127, 129, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161 or 163; b) a nucleotide sequence encoding a polypeptide comprising the amino acid sequence of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 50, 52, 54, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 114, 116, 118, 120, 122, 124, 126, 128, 130, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162 or 164; c) a nucleotide sequence that is at least 80% identical to the sequence of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 49, 51, 53, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 113, 115, 117, 119, 121, 123, 125, 127, 129, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161 or 163, wherein said nucleotide sequence encodes a polypeptide that retains activity; or, d) a nucleotide sequence encoding a polypeptide comprising an amino acid sequence having at least 80% sequence identity to the amino acid sequence of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 50, 52, 54, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 114, 116, 118, 120, 122, 124, 126, 128, 130, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162 or 164, wherein said polypeptide retains activity.
 20. A method for enabling an organism to respond to an environmental stimulus, comprising introducing into said organism a vector comprising at least one nucleotide sequence encoding a histidine kinase comprising SEQ ID NOS: 4, 8, 14, 16, 20, 24, 30, 34, or 36, and a response regulator comprising SEQ ID NOS: 2, 6, 10, 12, 18, 22, 26, 28, 32, or 38.
 21. The method of claim 20, wherein said environmental stimulus is selected from the group consisting of turgor pressure, a chemical stimulus, heavy-metal cations, oxygen, iron, an antimicrobial, and glucose.